representing a number

We know how to store individual 1s and 0s on our disk. But those aren’t particularly meaningful by themselves. We’re now going to look at how we can represent a number with them.

You already know several ways of representing a number:

1    one      I      un       uno
2    two      II     deux     dos
3    three    III    trois    tres

These representations are arbitrary. There’s nothing special about the letters t h r e e that means this many things: marble marble marble.

So let’s invent our own mini language for representing numbers. Let’s call it “roonish”.

note

This isn’t a programming language — it’s a regular language, like English.

Words in English are composed of the letters A, B, C, etc. Words in roonish will be composed of the “letters” 0 and 1.

Let’s come up with some arbitrary words in roonish:

English                               roonish
cat                                   100100011001
apple                                 000000101101
house                                 0101001011110
one                                   1010
two                                   111101110011110
three                                 1111
one million, six hundred and three    001

Remember, we can spell these words however we like — these are all equally valid.

… but they’re not equally convenient. Common concepts like “cat” or “two” should be something short and easy, not a long string like 111101110011110. Similarly, it’s silly for “one million, six hundred and three” to be a short word like 001 when we’re not going to use it often.

To simplify things, let’s forget about words like “cat” or “house” for now. Instead, we’re only going to define roonish words for numbers like “three” or “twenty-seven”, not arbitrary nouns.

What we want is a scheme that will give us a sensible spelling for each number.

base ten

To come up with our scheme, let’s take inspiration from the way we normally write numbers — e.g. 53, 1,027, etc. This system is called base ten, which means:

  • We use ten digits (0, 1, 2, …, 8, 9)
  • The columns get bigger by a factor of ten from right to left.

For example, 234 means:

column    value    amount
100s      2        +200
10s       3        +30
1s        4        +4
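
If it helps to see the same breakdown as code, here is a minimal Python sketch. The function name `place_values` and its `base` parameter are made up for this illustration; they are not part of anything defined above.

```python
def place_values(numeral: str, base: int = 10) -> list[tuple[int, int, int]]:
    """Return (column, digit, amount) for each digit, biggest column first."""
    rows = []
    # Walk the digits from right to left: the rightmost column is worth 1,
    # and each column to its left is worth `base` times more.
    for position, char in enumerate(reversed(numeral)):
        column = base ** position
        digit = int(char)
        rows.append((column, digit, digit * column))
    rows.reverse()
    return rows


rows = place_values("234")
for column, digit, amount in rows:
    print(f"{column}s  {digit}  +{amount}")

total = sum(amount for _, _, amount in rows)
print(total)  # 234
```

Adding up the amount column gets us back to the number we started with.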

This scheme has some nice properties:

  • We can extend it forever — there’s no maximum limit
  • Small numbers like 3 are easy to write
  • As a bonus, it makes it easy to do things like addition, multiplication, etc.

There’s one problem: base ten uses the digits from 0-9, while roonish spells its words using only 1s and 0s. roonish can’t represent the digits 2-9. Therefore, we need to modify the scheme.

adapting the scheme

Base ten isn’t unique. We can actually write numbers in any base. For example, we could use base two instead, which means:

  • We use two digits (0 and 1)
  • The columns get bigger by a factor of two.

Instead of the columns going 1, 10, 100 etc, they go 1, 2, 4, 8, doubling each time. So 10011 means:

column    value    amount
16s       1        +16
8s        0        +0
4s        0        +0
2s        1        +2
1s        1        +1
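
Here is a rough Python sketch of reading a base two word, column by column. The name `roonish_to_number` is invented for this example. The last line checks the result against Python's built-in `int(word, 2)`, which does the same conversion.

```python
def roonish_to_number(word: str) -> int:
    """Add up the column values of a word made of the letters 0 and 1."""
    total = 0
    column = 1  # the rightmost column is the 1s column
    for letter in reversed(word):
        if letter not in ("0", "1"):
            raise ValueError(f"not a roonish letter: {letter!r}")
        if letter == "1":
            total += column
        column *= 2  # each column to the left is worth twice as much
    return total


print(roonish_to_number("10011"))  # 19, i.e. 16 + 2 + 1
print(roonish_to_number("10011") == int("10011", 2))  # True
```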

Base two keeps all the nice properties of base ten — we can extend it forever, and it gives us sensible names for the numbers. And because it only uses 1s and 0s, it’s a perfect fit for roonish.

note

Base 2 is sometimes referred to as binary. However, binary can also refer to any data that uses 1 and 0, whether it uses base 2 or not.

base 2 dictionary

So if we’re using base 2 as the scheme for our dictionary, our numbers end up looking like this:

English    roonish
zero       0
one        1
two        10
three      11
four       100
five       101
six        110
seven      111
eight      1000
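
We don't have to write the dictionary out by hand. One common way to find the base two spelling of a number is to keep dividing by two and collect the remainders. The text above doesn't describe this method, so treat the following Python as one possible sketch; the name `number_to_roonish` is made up for this example.

```python
def number_to_roonish(number: int) -> str:
    """Spell a non-negative whole number using the letters 0 and 1."""
    if number < 0:
        raise ValueError("this sketch only handles zero and up")
    if number == 0:
        return "0"
    letters = []
    while number > 0:
        # The remainder tells us the letter for the current rightmost column.
        letters.append(str(number % 2))
        number //= 2
    # We collected the letters right to left, so flip them around.
    return "".join(reversed(letters))


names = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight"]
for value, name in enumerate(names):
    print(f"{name:<8}{number_to_roonish(value)}")
```

Because the scheme extends forever, the same function works for any bigger number too, such as `number_to_roonish(27)`.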

Let’s apply this to a disk:

twenty-two

  • The disk has the pattern marble, hole, marble, marble, hole
  • … which represents the string 10110
  • … which encodes the base two number 10110
  • … which equals the number known in English as “twenty-two”.
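
The same chain of steps can be sketched in Python. Representing the disk as a list of the strings "marble" and "hole" is an assumption made just for this example, and `read_disk` is a hypothetical name.

```python
def read_disk(pattern: list[str]) -> tuple[str, int]:
    """Turn a marble/hole pattern into its string of letters and its number."""
    letters = []
    for slot in pattern:
        if slot == "marble":
            letters.append("1")  # a marble stands for 1
        elif slot == "hole":
            letters.append("0")  # a hole stands for 0
        else:
            raise ValueError(f"unknown slot: {slot!r}")
    word = "".join(letters)
    # int(word, 2) reads the string as a base two number.
    return word, int(word, 2)


word, number = read_disk(["marble", "hole", "marble", "marble", "hole"])
print(word)    # 10110
print(number)  # 22
```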

In other words, we’ve stored the number “twenty-two” on our disk in a way that both we and can understand.

But what if we want to store things other than numbers? Next we’ll look at extending our scheme to include text.
