NewBase60



New Base 60

 
A base 60 numbering system using only ASCII numbers and letters.
 or
a side effect of building a personal URL shortener
 
Tantek Çelik, 2009-167
shortlink: http://ttk.me/w/NewBase60 (was tr.im/base60)
 

Introduction

 
I needed a way to compress numbers for a URL shortener I am building so I looked at existing work, decided I could do better with a better design methodology, and ended up deriving a base 60 numbering system from ASCII characters.
 

URL shortening

 
URL shorteners have grown quite popular in recent months, due mostly to Twitter and other microblogging services (Identica, FriendFeed) limiting posts to 140 characters.
 
However, 3rd party URL shortening services inevitably make links more fragile and vulnerable to a single point of failure due to maintenance or being hacked. The only work around for this problem is for sites to host their own URL shorteners, as http://flickr.com has done with http://flic.kr/ URLs.
 
Typical URL shorteners take an arbitrary URL and return a short URL, usually on a shorter domain, followed by a seemingly random set of alphanumeric digits that usually indicate an id of an entry in a database which has the original URL.
 
Thus to develop my own URL shortener I realized first I needed to come up with a good way to maximally compress a number (or a few) into characters in a URL.  Update: see Whistle - my algorithmically reversible shortener that uses NewBase60.
 
While many existing shorteners use numbers and letters, I found a few flaws and vulnerabilities in their choice of what set of numbers and letters to use and decided to use a methodology based on usability and robustness to deliberately choose which characters to use.  
 

Methodology

 
 

Derivation

 
 

New Base 60 numbering system

 
The complete numerical sexagesimal sequence grouped:
 
The complete sequence written out in rows of 10 for readability:
 
   0 1 2 3 4 5 6 7 8 9
 0 0 1 2 3 4 5 6 7 8 9 
10 A B C D E F G H J K
20 L M N P Q R S T U V
30 W X Y Z _ a b c d e
40 f g h i j k m n o p
50 q r s t u v w x y z

Open Source implementation

The following CASSIS V0 code (runs in PHP and javascript with the inclusion of cassis0.js and cassis0php.js from the CassisProject) is licensed under a Creative Commons Attribution Share-Alike license (for now). Please attribute to "Tantek Çelik" and link attribution to http://tantek.com
 

Cassis V0 code

runs in both PHP and javascript

FOR THE LATEST NewBase60 Cassis.js CODE SEE:

NOTE: THE BELOW CODE IS AN ORIGINAL PROOF OF CONCEPT AND HERE FOR HISTORICAL PURPOSES ONLY:

 

function num_to_sxg($n) {
 $s = "";
 $m = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz";
 if ($n===undefined || $n===0) { return 0; }
 while ($n>0) {
   $d = $n % 60;
   $s = strcat($m[$d],$s);
   $n = ($n-$d)/60;
 }
 return $s;
}

function num_to_sxgf($n, $f) {
 $s = num_to_sxg($n);
 if ($f===undefined) { 
   $f=1; 
 }
 $f -= strlen($s);
 while ($f > 0) { 
   $s = strcat("0",$s); 
   --$f; 
 }
 return $s;
}

function sxg_to_num($s) {
 $n = 0;
 $j = strlen($s);
 for ($i=0;$i<$j;$i++) { // iterate from first to last char of $s
   $c = ord($s[$i]); //  put current ASCII of char into $c  
   if ($c>=48 && $c<=57) { $c=$c-48; }
   else if ($c>=65 && $c<=72) { $c-=55; }
   else if ($c==73 || $c==108) { $c=1; } // typo capital I, lowercase l to 1
   else if ($c>=74 && $c<=78) { $c-=56; }
   else if ($c==79) { $c=0; } // error correct typo capital O to 0
   else if ($c>=80 && $c<=90) { $c-=57; }
   else if ($c==95) { $c=34; } // underscore
   else if ($c>=97 && $c<=107) { $c-=62; }
   else if ($c>=109 && $c<=122) { $c-=63; }
   else { $c = 0; } // treat all other noise as 0
   $n = 60*$n + $c;
 }
 return $n;
}

 

Additional Implementations

Ordered by date of first publication, earliest first.

Java and partial Python

On 2010-019 Faruk Akgul published his Java and partial Python implementations of NewBase60:

POSIX Shell

On 2010-078 Stephen Paul Weber published his POSIX Shell implementation of NewBase60:

Perl

On 2010-148 Steve Ivy published his Perl implementation of NewBase60:

Ruby

On 2010-152 Shane Becker published his Ruby implementation of NewBase60:

CommonJS and Nodejs aware Javascript

On 2010-224 Edward O'Connor (hober) published his CommonJS- and Node.js-aware Javascript version of NewBase60:

Fortran

Python

On 2015-027 Kevin Marks published a CC0 Python version of NewBase60 (based on Faruk Akgul's earlier partial Python version) that decodes as well as encoding:

Rust

On 2021-085 astralbijection published a GPL 3.0 Rust version of NewBase60:

 

References

 

See also

 

Related

 

Trivia


Return to MyNextStartup \ ProjectsList \ FrontPage.