base conversion program.

Discussion in 'Physics & Math' started by leopold, Dec 24, 2014.

  1. leopold Valued Senior Member

    Messages:
    17,455
    i've been thinking about writing a base conversion program and i need a little help.
    this program will have 2 inputs and 1 output.
    the first input is the concern.
    it will ask the question of which base you want to convert from and convert to.
    how should i implement this question?
    should the inputs be the actual base?
    for example:
    you want to convert from base 16 to base 10
    do you input F,10 or 16,10?
    which would be easier?
    this question needs answered before i can go ahead with the program because computers treat numbers and letters differently.

    for the curious:
    the maximum base size would be 255 but the first 31 ascii codes are control codes and can't be used.
    ascii codes 132 and greater can't be input by the keyboard.
    that leaves about 120 for max base size.
    i've decides to use 63 as max base size because of ease of use.
    bases will start at 2 and use all the digits, then all uppercase letters, then all lower case letters.

    i can get bigger base sizes (most likely unlimited) but the output will not decode into single characters.
     
  2. Google AdSense Guest Advertisement



    to hide all adverts.
  3. Dinosaur Rational Skeptic Valued Senior Member

    Messages:
    4,885
    A radix larger than 20 is unwieldy. While the Babylonians & perhaps other cultures used a radix 60 system, it is awkward due to the large number of digits required. The Babylonians used base 60 because it has so many integer factors, which provides simple expressions for commonly used fractions.

    BTW: I have no knowledge of Babylonian notation for fractions. They might have had some awkward notation, unlike that used in modern times.

    For human use, you need to memorize all the sums & products from 1 + 1 to 59 + 59 and 1*1 to 59*59.

    If you use a computer to do all the work, you still need to memorize 60 digit values from 00 to 59,

    The above seems like a lot of memory effort for only a little extra convenience (merely more fractions with a finite number of digits).

    What language do you expect to use? I have been using Visual Basic net, which is user friendly & has an excellent IDE.

    Radix conversion is a trivial computational task, which requires very little effort to implement in most any language.

    BTW: I do not call myself Dinosaur because I am a multi-ton reptile-like creature. I use that alias because I did mathematics in the prehistoric era prior to computers and did a lot of work with slide rules. I used an obscure circular slide rule called a Sexton's Omnimiter, which had many advantages over other slide rules.

    Have you ever considered using what is sometimes called a balanced radix notation for an odd radix? It is very handy for radix three, using (- 0 +) for digits. The first few values are as follows
    Code:
    000
    00+
    0+-
    0+0
    0++
    +--
    +-0
    +-+
    +0-
    +00
    +0+
    ++-
    ++0
    +++
    For a brief time a company I worked for in the 1950's thought they could develop a basic electronic component with three stable states. Balanced Radix three would be the best notation for such a system. They soon discovered that they could not develop such a basic component & Radix three notation bit the dust. We joked that the project was abandoned because the digits would be called tits & the female population would be offended.
     
  4. Google AdSense Guest Advertisement



    to hide all adverts.
  5. original sine Registered Senior Member

    Messages:
    924
    I would use integers to represent the inputs. It is easier to understand "base 20" compared to "base J". Also, the input can be independent from the counting system of the program (maybe J represents base 46 if the program changes to digits->lower->upper, but just giving the input of 46 will produce the expected results). Other than that, it's just a matter of preference as it is trivial to convert between a character and an integer.

    Dinosaur, if I understand that Balanced Radix notation correctly, +-- is 5, which is 3^2 - 3^1 - 3^0. Then I assume there is still a "signed tit", so that -++ is equal to -5.
     
  6. Google AdSense Guest Advertisement



    to hide all adverts.
  7. leopold Valued Senior Member

    Messages:
    17,455
    the program will not explain how it's done, it will only display the results.
    the human operator must learn only what the symbols mean.
    that's the major reason i selected numbers and letters, in order, instead of obscure characters.
    ease of memorization.
    2-9, A-Z, a-z.
    that would be a max base of 2*26+8.
    it will give me something to do.
    straight basic.
    i have visual studio and about 5 or 6 support CDs for it.
    i dunno, i'm just a nostalgic guy i guess.
    i would like to be able to do a direct conversion, instead of converting to base 10 then to the base you entered.
    not to my knowledge, although i'm familar with different algorithms.
    the code has to work for any integer value.
     
  8. leopold Valued Senior Member

    Messages:
    17,455
    there are 2 things i have to keep in mind.
    1. the output must decode into single "digits".
    2. i have to make it somewhat easy to remember what the codes mean.

    this effectively limits me to base 60.
    anything much over that and you will need a decoding chart to explain the symbols.
     
  9. rpenner Fully Wired Valued Senior Member

    Messages:
    4,833
    One proposal for base-60 digits:

    i: x mod 5 = 0, u: x mod 5 = 1, e: x mod 5 = 2, o: x mod 5 = 3, a: x mod 5 = 4,
    z: x mod 4 = 0, l: x mod 4 = 1, m: x mod 4 = 2, d: x mod 4 = 3,
    p: x mod 3 = 0, n: x mod 3 = 1, t: x mod 3 = 2.

    Code:
    three = [ 'p', 'n', 't' ]
    four = [ 'z', 'l', 'm', 'd' ]
    five = ['i', 'u', 'e', 'o', 'a' ]
    def digit(n) :
        return four[n%4] + five[n%5] + three[n%3]
    
    def digits(n):
         l = []
         while n >= 60:
             l.append(digit(n))
             n /= 60
         l.append(digit(n))
         l.reverse()
         return ' '.join(l)
    
    \(\begin{array}{r|rrrrrr} + & 0 & 10 & 20 & 30 & 40 & 50 \\ \hline 0 & zip & min & zit & mip & zin & mit \\ 1 & lun & dut & lup & dun & lut & dup \\ 2 & met & zep & men & zet & mep & zen \\ 3 & dop & lon & dot & lop & don & lot \\ 4 & zan & mat & zap & man & zat & map \\ 5 & lit & dip & lin & dit & lip & din \\ 6 & mup & zun & mut & zup & mun & zut \\ 7 & den & let & dep & len & det & lep \\ 8 & zot & mop & zon & mot & zop & mon \\ 9 & lap & dan & lat & dap & lan & dat \end{array}\)

    \(\begin{array}{c|ccccccccc} + & zip & lun & met & dop & zan & lit & mup & den & zot & lap & min \\ \hline zip & zip & lun & met & dop & zan & lit & mup & den & zot & lap & min \\ lun & lun & met & dop & zan & lit & mup & den & zot & lap & min & dut \\ met & met & dop & zan & lit & mup & den & zot & lap & min & dut & zep \\ dop & dop & zan & lit & mup & den & zot & lap & min & dut & zep & lon \\ zan & zan & lit & mup & den & zot & lap & min & dut & zep & lon & mat \\ lit & lit & mup & den & zot & lap & min & dut & zep & lon & mat & dip \\ mup & mup & den & zot & lap & min & dut & zep & lon & mat & dip & zun \\ den & den & zot & lap & min & dut & zep & lon & mat & dip & zun & let \\ zot & zot & lap & min & dut & zep & lon & mat & dip & zun & let & mop \\ lap & lap & min & dut & zep & lon & mat & dip & zun & let & mop & dan \\ min & min & dut & zep & lon & mat & dip & zun & let & mop & dan & zit \end{array}\) AND \(\begin{array}{c|ccccccccc} \times & zip & lun & met & dop & zan & lit & mup & den & zot & lap & min \\ \hline zip & zip & zip & zip & zip & zip & zip & zip & zip & zip & zip & zip \\ lun & zip & lun & met & dop & zan & lit & mup & den & zot & lap & min \\ met & zip & met & zan & mup & zot & min & zep & mat & zun & mop & zit \\ dop & zip & dop & mup & lap & zep & dip & mop & lup & zap & dep & mip \\ zan & zip & zan & zot & zep & zun & zit & zap & zon & zet & zup & zin \\ lit & zip & lit & min & dip & zit & lin & mip & dit & zin & lip & mit \\ mup & zip & mup & zep & mop & zap & mip & zup & mep & zop & map & lun \, zip \\ den & zip & den & mat & lup & zon & dit & mep & lan & zut & lun \, dop & lun \, min \\ zot & zip & zot & zun & zap & zet & zin & zop & zut & lun \, zan & lun \, zep & lun \, zit \\ lap & zip & lap & mop & dep & zup & lip & map & lun \, dop & lun \, zep & lun \, lup & lun \, mip \\ min & zip & min & zit & mip & zin & mit & lun \, zip & lun \, min & lun \, zit & lun \, mip & lun \, zin \end{array}\)

    "lon don" = 13 × 60 + 43 = 823.
    "zip point zot man let zot man let ..." = "zot man let / dat dat dat" = "lun / den" = 1/7
     
    Last edited: Dec 27, 2014
  10. leopold Valued Senior Member

    Messages:
    17,455
    resorting to a decoding chart is something i wish to avoid.
    i also wish to avoid multiple character "digits".
    if i resorted to a three character "digit" i could effectively get a max base of 216000 (60 cubed) using the symbols i introduced.

    would such a base be helpful, other than curiosity?

    a 2 character "digit" (max base 60 squared) might be okay but a 3 character "digit" is just a tad extreme in my opinion.
     
  11. leopold Valued Senior Member

    Messages:
    17,455
    something that might be useful:
    include an option for the user to define their own symbols for base ten.
    this might give some insights to pi and prime numbers.
     
  12. leopold Valued Senior Member

    Messages:
    17,455
    i've got the first part done, converting from base 10 to any base up to 60.
    the attached file contains the code i wrote and the algorithm i used.
    you can open the .BAS file with notepad.
     

    Attached Files:

  13. rpenner Fully Wired Valued Senior Member

    Messages:
    4,833
    Leopold's Basic program:
    Code:
     
    10 ' base conversion
    20 DIM C$(60)
    30 DIM C(60)
    40 FOR D=0 TO 59
    50 READ C$(D)
    60 NEXT D
    70 DATA 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J
    80 DATA K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d
    90 DATA e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x
    100 '
    110 '
    120 '
    130 INPUT"NUMBER TO CONVERT";N:N1=N
    140 IF N<>INT(N) THEN PRINT"MUST BE INTEGER": GOTO 130
    150 IF N<1 THEN PRINT"MUST BE GREATER THAN 0":GOTO 130
    160 INPUT"BASE CONVERTING TO";OB
    170 IF OB<>INT(OB) THEN PRINT"MUST BE INTEGER":GOTO 160
    180 IF OB>60 THEN PRINT"MUST BE LESS THAN 61":GOTO 160
    190 IF OB<2 THEN PRINT"MUST BE GREATER THAN 1":GOTO 160
    200 R=1/OB
    210 C=0
    220 X=N/OB
    230 N=INT(X)
    240 D=(X-N)/R
    250 C(C)=D:C=C+1
    260 IF N>OB-1 THEN 220
    270 C(C)=N
    280 IF N=0 THEN C=C-1
    290 PRINT:PRINT N1;"BASE 10 EQUALS"
    300 FOR Y=C TO 0 STEP-1
    310 PRINT C$(C(Y));
    320 NEXT Y:PRINT" BASE";OB:PRINT:GOTO 130
    Here's an alternate proposal in python which is coded to function both as a library of useful subroutines but also as a main program. Notice how delegating to classes and subroutines allows one to focus on just the part of the program one is interested in. It also contains a converter that works in both directions and built-in help for the class and functions.
    Code:
    #!/usr/bin/python
    
    import sys
    
    def char_range(lower_bound, upper_bound):
        """Returns list of characters inclusively between the bounds."""
        return map( chr, range( ord(lower_bound), 1 + ord(upper_bound) ) )
    
    main_alphabet = char_range('0', '9') + char_range('A', 'Z') + char_range('a', 'x')
    
    class Converter:
        """A simple base coverter for non-negative integers."""
        def __init__(self, alphabet=None):
            """Create new base converter, alphabet (if specified) must
            have unique letters."""
            if alphabet == None :
                alphabet = main_alphabet
            if len(alphabet) < 2 :
                raise RuntimeError("alphabet must have at least 2 letters")
            d = dict()
            for i in range(len(alphabet)):
                if d.has_key(alphabet[i]) :
                    raise RuntimeError("alphabet must not have duplicate letters")
                d[alphabet[i]] = i
            self.letters = alphabet[:]
            self.lookup = d
        def decode(self, digits):
            """Decodes a string consisting of letters in alphabet as an integer."""
            sum = 0
            base = len(self.letters)
            for letter in digits:
                if not self.lookup.has_key(letter):
                    raise RuntimeError("alphabet does not contain this letter: "
                    + str(letter) )
                sum = sum * base + self.lookup[letter]
            return sum
    
        def encode(self, number):
            """Encode a number in alphabet, returns a string."""
            list = []
            base = len(self.letters)
            while number >= base:
                list.append( self.letters[ number % base ] )
                number /= base
            list.append( self.letters[number] )
            list.reverse()
            return "".join(list)
    
    def input_integer(prompt, lower_bound=None, upper_bound=None):
        """"Prompt for integer from user with optional upper an lower bounds."""
        while 1:
            print prompt,
            n = sys.stdin.readline().strip()
            try:
                n = int(n)
            except ValueError:
                print "This does not look like an integer to me:", n
                continue
            if lower_bound != None and n < lower_bound:
                print "The number should be larger than", lower_bound
                continue
            if upper_bound != None and n > upper_bound:
                print "The number should be less than", upper_bound
                continue
            break
        return n
    
    def leopold_loop():
        """Main program."""
        try:
            while 1:
                n = input_integer("Number to convert: ", 0)
                ob = input_integer("Base converting to: ", 2, len(main_alphabet))
                short_alphabet = main_alphabet[:ob]
                x = Converter(short_alphabet)
                print n, "base 10 equals", x.encode(n), "base", ob
        except KeyboardInterrupt:
            pass
    
    # Check to see if this is being called as a main program or just
    # being used as a library and if this is a main program, run the
    # interactive leopold_loop.
    
    if __name__ == "__main__" :
        leopold_loop()
    
     
  14. Stryder Keeper of "good" ideas. Valued Senior Member

    Messages:
    13,102
    If you are wanting to build a base conversion for say dealing with a password system, you should consider that the current weakness that can be seen is that the lesser base value can fit into the larger secondary value but it can not be said the same of the inverse. This reduces what workload someone would have to do to bruteforce program if it's intention is to be involved in passwords.

    What my suggestion is that rather than being limited to the base conversion only being able to handle one specific way, you might consider looking at how to generate an inverse method, whereby in the case that the lower base is limited (lets say like and 8bit architecture versus that of something higher) you can split the base into clusters that equal no more than it's limited base, it's just naming the base clusters differently to reflect that (These of course shouldn't be constant handles, considering the nature of cryptology.)
     
  15. leopold Valued Senior Member

    Messages:
    17,455
    we each have our programming style, and i've always preferred programs that are straight forward.
    OTOH subroutines play a valuable role too.
    especially useful is the ability to write a series of stand alone code and call each of them from a main program of subroutine calls.
    other programs don't lend themselves well to subroutine calls, because they are too small.
    leopold loop?
    i'm honored.

    i've been thinking about using python but it seems more complicated than basic.
    plus, i've always had problems with programs that have no line numbers.

    now comes the hard part, converting from any base to ten.
    i'm going to have to deal with strings, how else can i input 10 and above?
    i assume the algorithm will be basically the same, and since THAT is true brings up the question of a direct convert.
    a solution might become apparent after i write the code.
     
  16. leopold Valued Senior Member

    Messages:
    17,455
    i have no intentions for the code except to write it.
    it's a real shame basic fell into oblivion, the syntax is really easy to understand.
    i do wish the RND function was better.

    i'm going to write a prime number generator.
    that will give me some kind of yardstick to measure execution speeds between pcbasic and GWbasic.

    i also have a few other programs in mind.
     
  17. rpenner Fully Wired Valued Senior Member

    Messages:
    4,833
    Typically, you don't convert to base-10 -- you convert to an integer which for most computers has an internal base-2 representation.

    As you see in the code above, my Converter class works both ways.

    Code:
    def decode(self, digits):
        """Decodes a string consisting of letters in alphabet as an integer."""
        sum = 0
        base = len(self.letters)
        for letter in digits:
            if not self.lookup.has_key(letter):
                raise RuntimeError("alphabet does not contain this letter: "
                + str(letter) )
            sum = sum * base + self.lookup[letter]
    return sum
    
    In the above, I used the programming structures of python to easily loop over the letters of a string and ensure that my lookup fails if the character does not belong in a base-whatever representation. But in basic, much of the same can be done:

    • Setup:
      • Initialize an array lookup[x] = -1 for x from 0 to 255
      • Set lookup[ord(letter)] = index for index from 0 to 59; "ord" might be called "asc"
    • For each pair of rep$ and base:
      • Initialize sum to 0
      • Set number of problems to 0
      • Loop over the letters in rep$ from left to right:
        • Convert the letter to it's numeric value
        • Mark it as a problem if value < 0 or value > 255 or lookup[value] < 0 or lookup[value] >= base
        • If not a problem, then sum := sum * base + lookup[value]
      • If there were no problems, sum has the value of the string.

    But you will want to read the documentation on MID$ and ORD/ASC yourself.

    One tricky part is how numbers are represented in your basic; typically integers have some maximum values and floating point numbers are approximate with some maximum precision at reproducing large integers.

    In base 60 UUR953oeV0G = 2^64, rbZWMThaW = 2^53, 5VO6SG = 2^32, ICG = 2^16 some of which may decode as zero.
    5VO6QD might decode as -123 if you are using base-32 integer math, while if you are using IEEE (approximate) floating point numbers, rbZWMThaW and rbZWMThaX might decode to the same number.

    As it so happens, Python is happy with exact integers of arbitrary magnitude, so none of these present a problem.
     
    Last edited: Jan 2, 2015
  18. leopold Valued Senior Member

    Messages:
    17,455
    here is the completed program, with a curious problem
    i haven't done extensive testing but i did find a bug.
    reproducing the error:
    i/o base= 35,10
    input Y, output 34.
    input 10, output 1.
    input 11, output 36.

    i have no idea what is causing this.
    putting a breakpoint at line 200 proves this error is coming from the first part of the program (lines 130-190)
     

    Attached Files:

  19. rpenner Fully Wired Valued Senior Member

    Messages:
    4,833
    I bet if you input 10,10,12345 you get 54321 as an output.

    Most of your problems are on line 180. You write:
    Code:
    130 L=LEN(N$):L1=0:N=0
    140 L1=L1+1:IF L1>L THEN 200
    150 C$=MID$(N$,L1,1):C=-1
    160 C=C+1:IF C>IB-1 THEN 190
    170 IF C$<>C$(C)THEN 160
    180 N=N+(C*(IB^(L1-1))):GOTO 140
    190 PRINT"INVALID NUMBER FOR BASE";IB:GOTO 120
    200 '
    
    • Line 130 :
      • N$ is the input string of digits,
      • IB is the base for conversion,
      • C$() is an array variable containing single digits and is indexed by their value, so C$(0) = "0" and C$(59) = "x".
      • L is the number of characters in N$,
      • L1 is the position in N$ we are examining, which starts at one less than the index of the first character, 1
      • N is the running total which will become our conversion, which starts at zero
    • Line 140: Between here and line 180, L1 loops from 1 to L, when the loop is done we escape to line 200
    • Line 150:
      • C$ is the character at the L1[sup]th[/sup] position of N$,
      • C is the as-yet-unknown numeric value of this digit which is initialized to one less than the smallest possible digit, 0
    • Line 160: between here and line 180, C loops from 0 to IB-1 but if the loop reaches completion we jump to line 190
    • Line 170: if the C$ is not the character that is a digit that corresponds to the value C, then we continue the loop to look for the next higher value of C, if any
    • Line 180: Update N by adding the value of the digit times IB raised to the power of how far we have moved left-to-right (minus one). Then continue the loop at line 140
    • Line 190: Complain because for some value of L1, the character C$ is not a digit with a value between 0 and IB-1.
    Look at the successive values of IB^(L1-1) for a 3 digit number: 1, IB, IB^2 , when it should be IB^2, IB, 1.
    Line 180 should be changed to EITHER
    Code:
    180 N=N+(C*(IB^(L-L1))):GOTO 140
    or
    Code:
    180 N=N*IB+C:GOTO 140
    .
     
  20. leopold Valued Senior Member

    Messages:
    17,455
    yep, which is curious.
    i specifically added the y loop because i wanted the MSD on the left.
    IB is the base you are converting from.
    the number you input for conversion must be in base IB
    i also use L1 for IB exponentiation.
    yep.
    when both loops are done, N is the base 10 equivalent of N$
    that sounds right, N=N+(C*(IB^(L-L1)))

    EDIT:
    YOU ARE JUST TOO SUAVE RPENNER.
    oops, guess you can tell where i've been.

    Please Register or Log in to view the hidden image!

     
    Last edited: Jan 2, 2015
  21. Dinosaur Rational Skeptic Valued Senior Member

    Messages:
    4,885
    Original: In Post #2, your represention of plus & minus 5 are correct. Note that there is no extra symbol to indicate positive & negative. In balanced notation, a number is positive if the first digit is positive & negative if the first digit is negative.
     
  22. leopold Valued Senior Member

    Messages:
    17,455
    here is a variant on the program i wrote.
    it asks you for an input base and the number you want to convert.
    it then generates the result in each base up to 60.
    the screen format screws up when printing some conversions because i don't know how many characters will be required for each number.
    the output format is xxx( y.
    xxx is the converted number
    y is the base
     

    Attached Files:

Share This Page