Ruby

This is the beginning of my Ruby Tutorial. Either read this, or head over to the Programming Exercises and start solving them using Ruby.

Getting started with Ruby

I have started to learn Ruby at Try Ruby. Here is what I've learned so far.

$ ruby --version
ruby 2.0.0p481 (2014-05-08 revision 45883) [universal.x86_64-darwin14]

The first thing we should try is printing to the screen. I've created a file with .rb extension. Not that it matters a lot, but I think editors and IDEs will recognize that this is a Ruby file by that extension and will be able to put nice colors on the various parts of the code.

print "Hello "
print "World\n"
puts "Hi"

puts Foo

I ran the code and got:

$ ruby examples/ruby/hello_world.rb

Hello World
Hi
examples/ruby/hello_world.rb:5:in `<main>': uninitialized constant Foo (NameError)

So calling print will print to the screen exactly what I gave to it. Calling puts will do the same, but it will also add a newline at the end of the output. If I want print to add a newline I can include \n in my string.
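To check the difference between print and puts without looking at the terminal, here is a small sketch that writes into an in-memory StringIO object instead of the screen. The helper name capture_output is made up for this illustration.

```ruby
require 'stringio'

# Hypothetical helper: collects what print and puts would write.
def capture_output
  out = StringIO.new
  out.print "Hello "     # print adds nothing to what we gave it
  out.print "World\n"    # the newline is there only because we wrote \n
  out.puts "Hi"          # puts appends the newline for us
  out.string
end

p capture_output   # "Hello World\nHi\n"
```

Using p instead of puts shows the string with its \n characters visible, which makes the trailing newlines easy to spot.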

The last output is an error. Without quotes, Ruby treats Foo as the name of a constant, and no such constant has been defined. It shows that we have to put strings in quotes.

Comments

We can put # after a statement, or even as the first character of a line, and everything to the right of this character (including the character itself) will be disregarded by Ruby. This is very handy for leaving comments for the next person looking at the code.

Arithmetic

Part of programming is calculating stuff using various arithmetic operations. Let's see how that works in Ruby:

puts 19+23   # 42
puts 19-23   # -4
puts 2*3     # 6

puts 2**3    # 8

puts 8/2     # 4
puts 3/2     # 1
puts 3/2.0   # 1.5
puts 3.0/2   # 1.5
puts 2.0/3   # 0.6666666666666666
puts 4/2.0   # 2.0



puts 3/0     # examples/ruby_arithmetic.rb:7:in `/': divided by 0 (ZeroDivisionError)
             #      from examples/ruby_arithmetic.rb:7:in `<main>'

puts 'done'

(The output of each line was added as a comment on the same line.)

The basic operations such as +, -, and * work as they should.

** is the exponent operation so 2**3 = 2*2*2 = 8

At first it seems that division also works as expected, 8/2 = 4 but then it turns out that 3/2 = 1.

Apparently if both numbers in a division are whole numbers, then Ruby will also return a whole number.

On the other hand, if either (or both) of the numbers is a floating point number (it has a decimal point in it), then the result will also be a floating point number, even if the value after the decimal point is 0 (4/2.0 = 2.0).

Dividing an integer by 0 generates an exception called ZeroDivisionError and stops the program. (We did not reach the puts 'done' statement.)
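If we would rather keep the program running, the exception can be rescued. The following is only a sketch; the helper name safe_divide is invented for this example. It also shows fdiv, which gives a floating point result from two whole numbers, and the fact that dividing a floating point number by 0 does not raise an exception.

```ruby
# Hypothetical helper: returns nil instead of stopping the program.
def safe_divide(a, b)
  a / b
rescue ZeroDivisionError
  nil
end

p safe_divide(8, 2)   # 4
p safe_divide(3, 2)   # 1  (still integer division)
p safe_divide(3, 0)   # nil
puts 3.fdiv(2)        # 1.5
puts 3.0 / 0          # Infinity
puts 'done'           # this time we do reach this line
```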

Ruby strings

We saw arithmetic operations on numbers, but we can also use some operations on strings:

puts "Hello".length     # 5
puts "Hello".reverse    # olleH

puts "Jar " * 2         # Jar Jar
puts 2 * 3              # 6

puts "2".to_s * 3       # 222
puts 2.to_s * 3         # 222

puts 2 * "Jar " 
 #   examples/ruby_strings.rb:7:in `*': String can't be coerced into Fixnum (TypeError)
 #      from examples/ruby_strings.rb:7:in `<main>'

Because everything in Ruby is an object, there are certain methods we can call on a string. For example, the length method will return the number of characters in the string.

The reverse method will return the characters in reverse order.

We can use the * operator on a string. It will return (and puts will display) a new string that contains several (in this case 2) copies of the original string.

If we want to have 3 copies of the number 2, we have a number of options.

puts 2 * 3 # 6

This one does not do that, as the * between two numbers will multiply them.

We can put the number 2 in quotes:

puts "2".to_s * 3 # 222

or we can use the to_s method to convert the number to a string.

puts 2.to_s * 3 # 222

This will be much more interesting once we start to use variables.

Finally, if we knew that we wanted 3 copies of the number 2, we could have just written "222" without trying to use all kinds of Ruby operations.

Absolutely finally, let's see what happens if we swap the string and the number in this multiplication operation:

puts 2 * "Jar "

We get an exception:

    examples/ruby/strings.rb:7:in `*': String can't be coerced into Fixnum (TypeError)
       from examples/ruby/strings.rb:7:in `<main>'

I guess we cannot do that.
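One way to stay out of trouble is to always put the string on the left. The sketch below wraps that in a helper called repeat (a name made up for this example) and also shows how the TypeError could be rescued. The exact wording of the error message depends on the Ruby version, so it is not shown here.

```ruby
# Hypothetical helper: converts the value to a string first, so the
# string is always on the left-hand side of the * operator.
def repeat(value, times)
  value.to_s * times
end

puts repeat(2, 3)        # 222
puts repeat("Jar ", 2)   # Jar Jar

# The number-first form raises TypeError; rescuing it keeps the program alive.
begin
  2 * "Jar "
rescue TypeError => e
  puts "Cannot multiply: #{e.message}"
end
```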

timestamp: 2015-02-08T15:30:01 tags:

  • print
  • puts
  • length
  • reverse
  • to_s

Variables and Variable Interpolation in Ruby

Using simple values as in the introduction to Ruby article will get boring soon. It is much more interesting to create names that can hold values and then use those names. We call these names "variables" because their content can usually be changed.

Variable names can start with a lowercase letter or an underscore. (A name that starts with an uppercase letter is a constant in Ruby.) After the first character they can contain lowercase and uppercase letters, digits, and underscores. Once we have assigned a value to a variable, we can use puts to print the content of the variable.

We can also use the variables for other operations. For example to add them together:

a = 23
b = 19

c = a + b
puts c

$ ruby add_numbers.rb
42

With this we can create a variable containing the name of the user, and then welcome that user by concatenating together two strings and the variable using + as string addition.

name = "Foo Bar"

puts "Hello " + name + ", how are you?"

$ ruby hello_name.rb
Hello Foo Bar, how are you?

Interpolation or variable embedding

There is however another way to create the same result:

name = "Foo Bar"

puts "Hello #{name} how are you?"

Here we used the #{} construct to tell Ruby that, instead of the word 'name', it needs to take the content of the variable called 'name' and include that in the string.
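The #{} construct accepts any expression, not only a variable name, and it only works inside double quotes. Here is a short sketch; the method names greeting and sum_line are made up for this illustration.

```ruby
# Interpolation of a plain variable.
def greeting(name)
  "Hello #{name}, how are you?"
end

# Interpolation of a whole expression: a + b is calculated first.
def sum_line(a, b)
  "#{a} + #{b} = #{a + b}"
end

puts greeting("Foo Bar")   # Hello Foo Bar, how are you?
puts sum_line(23, 19)      # 23 + 19 = 42

# In single quotes nothing is interpolated; the text is printed as it is.
puts 'Hello #{name}'       # Hello #{name}
```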

Variable declaration in Ruby

In Ruby there is no need to declare a variable. It just has to appear in an assignment before it is used in any other expression.


x = 23
puts x

puts y
y = 19

$ ruby bad_variable.rb
23
bad_variable.rb:5:in `<main>': undefined local variable or method `y' for main:Object (NameError)
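If we are not sure whether a name has been assigned yet, the defined? keyword can tell us without raising an exception. A minimal sketch follows; the method name declaration_demo and the variable name answer are made up for this example.

```ruby
def declaration_demo
  before = defined?(answer)   # nil: nothing called answer exists yet
  answer = 42
  after = defined?(answer)    # "local-variable"
  [before, after]
end

p declaration_demo   # [nil, "local-variable"]
```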

timestamp: 2015-10-06T10:50:01 tags:

  • puts
  • #{}

Arrays in Ruby

Arrays in Ruby are similar to arrays in Perl and lists in Python. They can hold a bunch of unrelated values in slots indexed from 0 up.

Similar to the other two dynamic languages, arrays in Ruby can also grow and shrink as the user needs them. There is no need for special memory handling.

Arrays in Ruby are created by comma separated values assigned to a variable:

names = 'Foo', 'Bar', 'Baz'

We can access them using an index after the name of the variable:

puts names[0]    # Foo
puts names[1]    # Bar
puts names[2]    # Baz

We can fetch the size of the array, by using the length method:

puts names.length   # 3

If we try to access an element by an index that is not in the array, we get nil, and puts will print nothing (just an empty line):

puts names[3]       # (nothing)

On the other hand, just like in Perl, Ruby arrays understand negative indexes. They access the array from the other end:

puts names[-1]      # Baz
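When an empty line is not what we want for a missing element, the fetch method of arrays accepts a default value. This is just a sketch; the helper name lookup and the 'n/a' default are choices made for this example.

```ruby
# Hypothetical helper: returns 'n/a' when the index is outside the array.
def lookup(list, index)
  list.fetch(index, 'n/a')
end

names = ['Foo', 'Bar', 'Baz']

p names[3]              # nil
puts lookup(names, 3)   # n/a
puts lookup(names, -1)  # Baz
puts names.last         # Baz
```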

Assign to array element

We can assign a value to any of the indexes in the array. It overwrites the old value. Then we can fetch the current value from the array.

names[1] = 'Happy'
puts names[1]     # Happy

Not only that, but we can also assign to indexes that were not part of the array previously. The array will automatically grow as we can see from the value returned by the length method.

names[3] = 'Moo'
puts names[3]       # Moo
puts names.length   # 4

We can even assign value to an index further away. The array will be enlarged and the intermediate elements will remain empty. (They will have nil in them.)

names[6] = 'Far Away'
puts names[6]       # Far Away
puts names.length   # 7
puts names[5]       # (nothing)

Going over the elements of the array

There are a number of ways to iterate over the elements of an array.

Pretty Print values for debugging

Similar to the Data::Dumper module in Perl, Ruby has the pp library for Pretty Printing data structures. It makes it easy to print out the content of an array:

require 'pp'
pp names        # ["Foo", "Happy", "Baz", "Moo", nil, nil, "Far Away"]

push

If we want to add one or more elements to an array, we can use the push method to do that.

names.push 'Hello', 'World'
pp names     # ["Foo", "Happy", "Baz", "Moo", nil, nil, "Far Away", "Hello", "World"]

pop

The opposite operation is called pop. It will fetch the last element of an array, remove it from the array and return it to be used in an assignment:

last = names.pop
pp names    # ["Foo", "Happy", "Baz", "Moo", nil, nil, "Far Away", "Hello"]
puts last   # World
pp last     # "World"

We can even pass a parameter to pop to indicate how many elements we wish to remove from the end of the array. In that case (even if we passed 1), the returned value will be an array of the removed elements:

last = names.pop 2
pp names    # ["Foo", "Happy", "Baz", "Moo", nil, nil]
pp last     # ["Far Away", "Hello"]

shift

shift moves the content of the array to the left. The left-most element is removed from the array and returned to be used in an assignment (or any other operation). It can be thought of as similar to pop, just at the beginning of the array.

first = names.shift
pp names    # ["Happy", "Baz", "Moo", nil, nil]
puts first  # Foo

unshift

unshift is the opposite of shift. It puts one or more elements at the beginning of the array. This method is rarely used.

names.unshift 'Zorg', 'Morg'
pp names    # ["Zorg", "Morg", "Happy", "Baz", "Moo", nil, nil]
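These four methods are enough to use an array as a stack (push and pop work on the same end) or as a queue (push on one end, shift on the other). The sketch below illustrates the difference; the helper names are made up for this example.

```ruby
# Takes elements from the end: last in, first out.
def drain_as_stack(items)
  work = items.dup            # do not change the caller's array
  result = []
  result << work.pop until work.empty?
  result
end

# Takes elements from the beginning: first in, first out.
def drain_as_queue(items)
  work = items.dup
  result = []
  result << work.shift until work.empty?
  result
end

p drain_as_stack(['Foo', 'Bar', 'Baz'])   # ["Baz", "Bar", "Foo"]
p drain_as_queue(['Foo', 'Bar', 'Baz'])   # ["Foo", "Bar", "Baz"]
```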

Full example


names = 'Foo', 'Bar', 'Baz'

puts names[0]   # Foo
puts names[1]   # Bar
puts names[2]   # Baz

puts names.length   # 3

puts names[3]       # (nothing)


puts names[-1]      # Baz

names[1] = 'Happy'
puts names[1]       # Happy


names[3] = 'Moo'
puts names[3]       # Moo
puts names.length   # 4

names[6] = 'Far Away'
puts names[6]       # Far Away
puts names.length   # 7
puts names[5]       # (nothing)

require 'pp'
pp names     # ["Foo", "Happy", "Baz", "Moo", nil, nil, "Far Away"]

names.push 'Hello', 'World'
pp names     # ["Foo", "Happy", "Baz", "Moo", nil, nil, "Far Away", "Hello", "World"]

last = names.pop
pp names    # ["Foo", "Happy", "Baz", "Moo", nil, nil, "Far Away", "Hello"]
puts last   # World
pp last     # "World"


last = names.pop 2
pp names    # ["Foo", "Happy", "Baz", "Moo", nil, nil]
pp last     # ["Far Away", "Hello"]


first = names.shift
pp names    # ["Happy", "Baz", "Moo", nil, nil]
puts first  # Foo


names.unshift 'Zorg', 'Morg'
pp names    # ["Zorg", "Morg", "Happy", "Baz", "Moo", nil, nil]

timestamp: 2015-10-29T14:31:23 tags:

  • []
  • array
  • push
  • pop
  • shift
  • unshift

For loop in Ruby (iterating over array elements)

In Ruby the C-like for-loop is not in use. Instead of that people usually iterate over the elements of an array using the each method.


names = ['Foo', 'Bar', 'Baz']

puts names
puts


names.each { |item|
    puts item
}
puts

names.each do |item|
    puts item
end

In this example we have an array with 3 elements. At first we just printed the array as it is. We got one value on every line. That can be useful for debugging, but if we want to go over the elements we need some kind of a loop.

The each method allows us to iterate over the elements of the array. On every iteration the variable between the pipes (item in our case) will receive the current value.

Here we have 2 examples. The first one uses curly braces to mark the beginning and the end of the block, the other one uses the do - end pair.

$ ruby examples/ruby/iterating_on_array.rb 
Foo
Bar
Baz

Foo
Bar
Baz

Foo
Bar
Baz
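If we also need the position of each element, arrays have an each_with_index method that passes the index as a second block parameter. A small sketch, with the helper name numbered made up for this example:

```ruby
# Builds one "index: value" line for every element of the array.
def numbered(items)
  lines = []
  items.each_with_index do |item, i|
    lines << "#{i}: #{item}"
  end
  lines
end

puts numbered(['Foo', 'Bar', 'Baz'])
# 0: Foo
# 1: Bar
# 2: Baz
```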

There is another way to iterate over the elements, but it can have a nasty side effect and is thus not recommended.

names = ['Foo', 'Bar', 'Baz']

for item in names 
    puts item
end

The code looks OK, and the result is OK:

$ ruby examples/ruby/for_loop_on_array.rb 
Foo
Bar
Baz

But if we have used the item variable earlier, then this for loop will overwrite that other item variable with the last value seen in the loop.

names = ['Foo', 'Bar', 'Baz']

item = 'Moose'

for item in names 
    puts item
end

puts
puts item


Note how, after the loop has finished the variable item holds 'Baz'.

$ ruby examples/ruby/for_loop_on_array_global.rb 
Foo
Bar
Baz

Baz
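For comparison, here is a sketch that runs both kinds of loop over the same array and reports what the outer item variable holds afterwards. The block parameter of each is local to the block, so only the for loop changes the outer variable. The method name loop_variable_demo is made up for this example.

```ruby
def loop_variable_demo(names)
  item = 'Moose'

  names.each { |item| item }   # this item exists only inside the block
  after_each = item

  for item in names
    # the loop variable is the very same item as outside the loop
  end
  after_for = item

  [after_each, after_for]
end

p loop_variable_demo(['Foo', 'Bar', 'Baz'])   # ["Moose", "Baz"]
```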

timestamp: 2015-02-10T13:00:01 tags:

  • each
  • for

Range in Ruby

Ruby has two operators to generate a range of values. .. is inclusive and ... is exclusive.

..

for i in 0..3
    puts i
end

Will generate

0
1
2
3

including the beginning and the end similar to how the range in Perl works.

...

If we use 3 dots instead of two, then the range will include the lower limit, but not the higher limit. Similar to how range in Python works.

for i in 0...3
    puts i
end

0
1
2
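A quick way to compare the two operators is to convert both kinds of range to an array with to_a. This is only a sketch; the method names are invented for this example.

```ruby
# Two dots: the upper limit is part of the range.
def inclusive_values(first, last)
  (first..last).to_a
end

# Three dots: the upper limit is left out.
def exclusive_values(first, last)
  (first...last).to_a
end

p inclusive_values(0, 3)     # [0, 1, 2, 3]
p exclusive_values(0, 3)     # [0, 1, 2]
puts (0..3).include?(3)      # true
puts (0...3).include?(3)     # false
```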

reverse range

If the limit on the left hand side is higher than on the right hand side, the range operator won't return any values.

for i in 7 .. 4
    puts i
end

This loop does not print anything.

As an alternative we can create an increasing list of numbers and then call the reverse method on it. For this, however, we first need to convert the range to an array:

for i in (4..7).to_a.reverse
    puts i
end

printing:

7
6
5
4
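Two other ways to count downwards, shown here as a sketch, are the downto method of integers and the reverse_each method of ranges. The helper name countdown is made up for this example.

```ruby
# Collects the numbers from high down to low, both included.
def countdown(high, low)
  values = []
  high.downto(low) { |i| values << i }
  values
end

p countdown(7, 4)                 # [7, 6, 5, 4]
p((4..7).reverse_each.to_a)       # [7, 6, 5, 4]
p countdown(4, 7)                 # []
```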

Range of letters

In addition to creating ranges of numbers, Ruby can also create a range of letters:

for i in 'a'..'d'
   puts i
end

a
b
c
d

Range of characters

Not only that, but we can use any two characters in the visible part of the ASCII table:

for i in 'Z'..'a'
   puts i
end

Z
[
\
]
^
_
`
a

Range with variables

We can also use variables as the lower and upper limits:

x = 3
y = 6 
for i in x .. y
   puts i
end

timestamp: 2015-09-24T23:30:01 tags:

  • ..
  • ...
  • to_a
  • reverse

ARGV - the command line arguments of a Ruby program

When you run a script written in Ruby you can put all kinds of values on the command line after the name of the script:

For example:

ruby code.rb abc.txt def.txt qqrq.txt

or like this:

ruby code.rb Hello --machine big -d -tl

The question, though, is how the Ruby program can know what was given on the command line.

Ruby maintains an array called ARGV with the values passed on the command line. We can access the elements of this array, just as any other array:

ARGV[0] is going to be the first value after the name of the script.

We can iterate over the elements either directly with a for loop:

for arg in ARGV
   puts arg
end

or iterating over the range of indexes, and accessing the elements using that index.

for i in 0 ... ARGV.length
   puts "#{i} #{ARGV[i]}"
end

$ ruby command_line_argv_with_index.rb foo bar --machine big
0 foo
1 bar
2 --machine
3 big
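Because ARGV is a regular array, all the usual array methods work on it. As a sketch, the following separates the values that look like options (they start with a dash) from the rest using partition. The helper name split_args is made up, and this is not a real option parser: it does not know that big belongs to --machine.

```ruby
# Hypothetical helper: returns [options, plain_values].
def split_args(args)
  args.partition { |arg| arg.start_with?('-') }
end

options, values = split_args(ARGV)
puts "Options: #{options.join(' ')}"
puts "Values:  #{values.join(' ')}"
```

Running it as ruby code.rb Hello --machine big -d -tl would put --machine, -d and -tl in the first list and Hello and big in the second.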

Verifying the number of arguments

For a simple input validation we can check the length of the ARGV array. Report if we have not received enough arguments and exit the program early.

if ARGV.length < 2
  puts "Too few arguments"
  exit
end

puts "Working on #{ARGV}";

Running this script we get:

$ ruby command_line_argv_check_length.rb one
Too few arguments

$ ruby command_line_argv_check_length.rb one two
Working on ["one", "two"]

Values received on the command line are strings

In this snippet of code we first check if we got exactly 2 parameters, and if we did, we add them together:

if ARGV.length != 2
  puts "We need exactly two arguments"
  exit
end

puts ARGV[0] + ARGV[1]

$ ruby command_line_argv_add.rb 23 19
2319

The result might not be surprising to you if you know that the values the user passes on the command line are received as strings, even if they look like numbers. If we would like to use them as numbers we have to convert them using to_i:

if ARGV.length != 2
  puts "We need exactly two arguments"
  exit
end

puts ARGV[0].to_i + ARGV[1].to_i

$ ruby command_line_argv_add_numbers.rb 23 19
42
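One thing to be aware of is that to_i never complains: a string that does not start with digits simply becomes 0. If we want to reject such input, the Integer() conversion function raises an exception instead. A sketch, with the helper name parse_number made up for this example:

```ruby
# Hypothetical helper: returns the number, or nil if the text is not
# a valid base-10 integer.
def parse_number(text)
  Integer(text, 10)
rescue ArgumentError, TypeError
  nil
end

puts "abc".to_i          # 0
puts "23abc".to_i        # 23
p parse_number("23")     # 23
p parse_number("23abc")  # nil
p parse_number("abc")    # nil
```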

timestamp: 2015-10-06T22:30:01 tags:

  • ARGV
  • to_i

Open file and read content in Ruby

Reading the content of a file is one of the most important tasks in any programming language. In Ruby it is very easy to do so.

if ARGV.length != 1
    puts "We need exactly one parameter. The name of a file."
    exit;
end

filename = ARGV[0]
puts "Going to open '#{filename}'"

fh = open filename

# puts fh

content = fh.read

fh.close

puts content

ruby read_file.rb  some/file.txt

This program expects us to supply the name of a file on the command line and then uses ARGV to access this value. Actually, at first we check if the user has supplied the correct number of parameters and exit the program if not.

Then we use the open call to open the file for reading. This will return an instance of the File class.

(If we printed out the content of fh we would get something like #<File:0x007f8c0310f748>.)

The read method of this class will read in the content of the file, which we have assigned to the variable content. This is the whole content of the file, including the embedded newlines.

This is basically the same as the slurp mode in Perl.

Read file line-by-line

Reading in the whole file in one operation as seen above is easy, but not necessarily a good idea. For example if you have a 10 GB log file, you probably would not want to read the whole thing into memory. Do you even have 10 GB free memory?

In such cases it is probably better to read the file line-by-line. After reading each line, do whatever processing you need to do on that line and then, replace it with the next line. This way at any given time we only hold one line in memory. Much more efficient.

There are actually at least two ways to write this and I am not sure if there is any advantage/disadvantage in either of them.

This first one using gets and while looks more similar to what someone coming from Perl 5 would write:

if ARGV.length != 1
    puts "We need exactly one parameter. The name of a file."
    exit;
end

filename = ARGV[0]
puts "Going to open '#{filename}'"

fh = open filename

while (line = fh.gets) 
   puts line
end

fh.close

The other one using each looks more Rubyish:

if ARGV.length != 1
    puts "We need exactly one parameter. The name of a file."
    exit;
end

filename = ARGV[0]
puts "Going to open '#{filename}'"

fh = open filename

fh.each do |line|
   puts line
end

fh.close
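A third variation, sketched below, passes a block to File.open. In that form Ruby closes the file for us when the block ends, so there is no fh.close to forget. To keep the example self-contained it creates its own temporary file; the helper name read_lines and the sample content are made up for this illustration.

```ruby
require 'tempfile'

# Hypothetical helper: returns the lines of a file without the newlines.
def read_lines(path)
  lines = []
  File.open(path) do |fh|
    fh.each { |line| lines << line.chomp }
  end                       # the file is closed here automatically
  lines
end

# Create a small temporary file to read.
sample = Tempfile.new('sample')
sample.write("first\nsecond\n")
sample.close

p read_lines(sample.path)   # ["first", "second"]
sample.unlink               # remove the temporary file
```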

timestamp: 2015-10-11T12:30:01 tags:

  • open
  • File
  • read
  • readline
  • gets
  • while
  • each

Download an HTML page using Ruby

While a page on a web-site is totally different from a file, several languages provide a way to read them as if they were regular files. I am not sure if this is a good idea, but it certainly works for some people.

In Ruby, the open-uri module provides this simplified interface.

After loading the module with require, it overrides the standard open function so from now on, in addition to opening regular files, it will be able to 'open' URLs as well. Of course, it can only open them read-only, as we can only fetch pages and cannot push them out, but it can accept all kinds of additional parameters. (In Ruby 3.0 and later, the plain open call no longer handles URLs; there you would call URI.open instead.)

require 'open-uri'
url = 'http://code-maven.com/'
fh = open(url)
html = fh.read
puts html

The open in such cases will return an instance of StringIO. If we printed out the content of fh we would get:

puts fh   #  #<StringIO:0x007fc41c8bc238>

Once we get the object we can apply the same methods as on a regular filehandle. For example, we can use the read method to read in the content of the whole page. As opposed to the case when we read regular files, in this case there is no efficiency reason to read the content line-by-line. The way HTTP works, it does not make much sense. By the time we start reading the page, the whole document has arrived and is located in the memory of our program. We might as well copy it to our internal variable using the read method.

Lie about who we are

When a browser accesses a web site it tells the site what kind of browser it is, which version, etc. The same happens when we "open" a web page using the open function supplied by the open-uri module.

By default, open-uri identifies itself as 'Ruby', which does not say much. We can change it to whatever we want by passing "User-Agent" to the open call:

require 'open-uri'
url = 'http://code-maven.com/'
fh = open(url, 
   "User-Agent" => "Code-Maven-Example (see: http://code-maven.com/download-a-page-using-ruby )"
)
html = fh.read
puts html

This string will be written in the Access log of the web server we connect to.

timestamp: 2015-10-11T16:30:01 tags:

  • open-uri
  • User-Agent

Basic data structures in Ruby (Scalar, Array, Hash)

In Ruby there are 3 basic data structures.

Scalars can hold a single value: a number or string.

An Array is an ordered list of scalars.

A Hash is a set of key-value pairs where the keys are unique strings and the values are scalars.

The class method can tell us what kind of value a variable contains:


x = 43
puts x.class      # Fixnum

q = 3.14
puts q.class      # Float

z = "abc"
puts z.class      # String

colors = [ 'Blue', 'Green', 'Yellow' ]
puts colors.class # Array

person = { 'fname' => 'Foo', 'lname' => 'Bar' }
puts person.class  # Hash
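Instead of printing the class, a program can also branch on it. The sketch below uses a case expression for that; the method name kind_of_value and the labels it returns are made up for this example. Note that on Ruby 2.4 and later x.class prints Integer rather than Fixnum, which is why the sketch tests against Integer; that works on older versions too, because Fixnum is a kind of Integer.

```ruby
# Hypothetical helper: describes a value based on its class.
def kind_of_value(value)
  case value
  when Integer then 'whole number'
  when Float   then 'floating point number'
  when String  then 'string'
  when Array   then 'array'
  when Hash    then 'hash'
  else              'something else'
  end
end

puts kind_of_value(43)                    # whole number
puts kind_of_value(3.14)                  # floating point number
puts kind_of_value("abc")                 # string
puts kind_of_value(['Blue', 'Green'])     # array
puts kind_of_value({ 'fname' => 'Foo' })  # hash
```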


timestamp: 2015-10-12T12:01:01 tags:

  • class
  • Fixnum
  • Float
  • String
  • Array
  • Hash

Reading CSV file in Ruby

Ruby comes with a standard library called CSV to make it easy to read files with Comma Separated Values.

CSV file

In this CSV file the 3rd field in every "row" is a number. We would like to sum these numbers.

Budapest,Bukarest,1200,km
Tel Aviv,Beirut,500,km
London,Dublin,300,km
New York,"Moscow, East",6000,km
Local,"Remote
Location",10,km

If it was a simpler file we could read it line-by-line and use split to cut it into parts, but in this file there is a field that has a comma in it. Plain split would not be able to handle the field enclosed in quote marks " containing a comma.

This file also has a field with an embedded newline. So the physical rows of the file, marked by \n newlines, are not the same as the logical rows that any good CSV parser would understand.

require "csv"


filename = File.dirname(File.dirname(File.expand_path(__FILE__))) + '/data/distance.csv'
sum = 0
CSV.foreach(filename) do |row|
  sum += row[2].to_i
end

puts sum

In this example we first load the CSV module, then we use the CSV.foreach(filename) construct to iterate over the file, logical row by logical row. On each iteration the variable row is going to be an array. The third element can be accessed using index 2. We have to convert the value to a number using to_i and then we can add it to the variable sum.
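To see why plain split is not enough, the following sketch compares the two approaches on the same data, held in a string so that no file is needed. It uses CSV.parse, which works on a string instead of a file. The constant and method names are made up for this example. (On very recent Ruby versions the csv library may need to be installed as a gem.)

```ruby
require 'csv'

# The same content as the sample CSV file, kept in a string.
SAMPLE_CSV = <<'END_OF_DATA'
Budapest,Bukarest,1200,km
Tel Aviv,Beirut,500,km
London,Dublin,300,km
New York,"Moscow, East",6000,km
Local,"Remote
Location",10,km
END_OF_DATA

# Sums the 3rd field of every logical row using the CSV parser.
def sum_third_field(text)
  CSV.parse(text).map { |row| row[2].to_i }.reduce(0, :+)
end

# Naive version: physical lines split on commas, ignoring quotes.
def naive_sum(text)
  text.each_line.map { |line| line.chomp.split(',')[2].to_i }.reduce(0, :+)
end

puts sum_third_field(SAMPLE_CSV)   # 8010
puts naive_sum(SAMPLE_CSV)         # 2000
```

The naive version silently misses the last two rows: in each of them the third piece after splitting is not a number, so to_i turns it into 0.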

timestamp: 2015-10-09T16:30:01 tags:

  • CSV

Analyze Apache log file - count localhost in Ruby

The exercise was, given a log file created by the Apache web server (or for that matter by any web server), to count how many hits arrived from localhost (IP 127.0.0.1) and how many arrived from any other place.

The log file looks like this:

127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET / HTTP/1.1" 500 606 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET /favicon.ico HTTP/1.1" 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET / HTTP/1.1" 500 612 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET /favicon.ico HTTP/1.1" 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:53:10 +0300] "GET / HTTP/1.1" 500 612 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET / HTTP/1.0" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET /style.css HTTP/1.1" 200 614 "http://machine.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET /img/machine-round.jpg HTTP/1.1" 200 17524 "http://machine.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.11 - - [10/Apr/2007:10:54:21 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880 "http://machine.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:51 +0300] "GET / HTTP/1.1" 200 34 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:51 +0300] "GET /favicon.ico HTTP/1.1" 200 11514 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:53 +0300] "GET /cgi/machine.pl HTTP/1.1" 500 617 "http://contact.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET / HTTP/0.9" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:27 +0300] "GET / HTTP/1.1" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:34 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880 "http://machine.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:45 +0300] "GET /talks/Fundamentals/read-excel-file.html HTTP/1.1" 404 311 "http://machine.local/unix_sysadmin.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"

The Algorithm:

We need two counters, one for counting the hits from 127.0.0.1 and one to count all the other hits. Then we need to go over all the lines. Extract the IP address, and based on its value to increment one of the counters.

We expect the program to be used as ruby apache_localhost.rb data/apache_access.log. That is, we expect the user to provide the name of the Apache log file as a parameter on the command line.

Therefore in the first few lines we check if the user has supplied any filename on the command line by checking the number of elements in ARGV. If the given number of parameters is not 1, then we tell the user how to use our program and exit.

Then we copy the name of the file from ARGV to an internal variable called filename. Mostly for readability of the code.

Then we create the two counters and initialize them to 0.

Then we open the file for reading and read it line-by-line using each. On each iteration the variable line will hold the current line from the file.

The IP address is the first value in every row up till the first space. There are a number of ways to extract it from the line. In this case we used the index method on the line passing a space to it. That will return the location of the first space in the line. Because the string indexing starts with 0, this number will be also the length of the IP address. Hence we have assigned the result to a variable called length.

We can use this variable then to extract the substring from the line that starts on character 0 and includes length characters. We can do that by providing the index of the beginning of the substring and the length of the substring we would like to extract. That will be the IP address from the current line.

What remains is to check if this equals 127.0.0.1 (localhost) and increment the appropriate counter.


if ARGV.length != 1 then
    puts "We need the name of the log file"
    exit
end

filename = ARGV[0]

local = 0
remote = 0

fh = open filename
fh.each do |line|
    length = line.index(' ')
    ip = line[0, length]
    if ip == '127.0.0.1' then
        local = local+1
    else
        remote = remote+1
    end
end
fh.close

puts "Number of remote requests is #{remote}. Number of local requests was #{local}"
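As mentioned, there are other ways to extract the IP address. The sketch below uses split instead of index and keeps the two counters in a hash. It works on an array of lines so it can be tried without a log file; the sample lines are shortened versions of the ones above, and the method name count_hits is made up for this example.

```ruby
# Counts local and remote hits in a list of log lines.
def count_hits(lines)
  counts = { 'local' => 0, 'remote' => 0 }
  lines.each do |line|
    ip = line.split(' ', 2).first     # everything before the first space
    next if ip.nil?                   # skip empty lines
    key = ip == '127.0.0.1' ? 'local' : 'remote'
    counts[key] += 1
  end
  counts
end

sample = [
  '127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET / HTTP/1.1" 500 606',
  '139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET / HTTP/1.1" 500 612',
  '127.0.0.11 - - [10/Apr/2007:10:54:21 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880'
]

counts = count_hits(sample)
puts "Number of remote requests is #{counts['remote']}. Number of local requests was #{counts['local']}"
```

Note that 127.0.0.11 is counted as remote, because we compare the whole address and not just its beginning.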

timestamp: 2015-10-27T08:30:01 tags:

  • index
  • substr
  • open
  • ARGV
  • []

Pretty printing complex data structures in Ruby - like Data::Dumper in Perl

Creating complex data structures (hashes of hashes of arrays of hashes of arrays....) is easy in Ruby.

Understanding what we have on our hands can be a bit more difficult.

Perl has a built-in module called Data::Dumper that can be used to print complex data structures, for example when debugging Perl programs.

Ruby has pp, the Pretty Printer for the same purpose.
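A minimal sketch of how that might look with a nested structure follows. The data is invented for this example. Besides pp, which prints directly, the library also adds a pretty_inspect method that returns the same text as a string, which is handy if we want to write it to a log instead of the screen. The exact layout of the output depends on the Ruby version, so it is not shown here.

```ruby
require 'pp'

# Hypothetical helper: returns the pretty-printed form as a string.
def describe(data)
  data.pretty_inspect
end

person = {
  'name'      => 'Foo Bar',
  'languages' => ['Perl', 'Python', 'Ruby'],
  'address'   => { 'city' => 'Budapest', 'phones' => ['123', '456'] }
}

pp person                 # prints the structure to the screen
text = describe(person)   # the same text, but as a string
puts text.length
```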

timestamp: 2015-10-22T07:30:01 tags:

  • pp

The 14 most important Ruby Resources

The most important resources for Ruby programmers.

ruby-lang.org is the main web site of Ruby. ruby-doc.com holds the documentation of the Ruby programming language and its standard libraries.

Gems are 3rd-party libraries for Ruby. rubygems.org holds the information about them.

www.rubydoc.info holds documentation of Gems, the 3rd party Ruby libraries.

Learning Ruby

learnrubythehardway.org an online book to learn Ruby.

rubylearning.com

tryruby a web-based editor to try the basics of Ruby.

Screencasts

railscasts.com holds a lot of screencasts about Ruby on Rails. Many can be watched free of charge. For some you need to pay. The production of new episodes stopped in June 2013.

rubytapas.com a subscription only series of screencasts (2 every week) about programming in Ruby.

Web frameworks

Sinatra is a light-weight route-based web framework for Ruby.

Ruby on Rails

Ruby on Rails (aka ROR) is the most popular web framework for Ruby.

www.railstutorial.org is a tutorial for Ruby on Rails.

Ruby news

Ruby Weekly newsletter a weekly e-mail newsletter about Ruby.

rubyinside.com Ruby news on the web. (Stopped in October 2014.)

Other sites

timestamp: 2015-10-22T05:30:01

Solution: Sum of numbers in a file implemented in Ruby

The exercise was to take a file where each line contains a number and display the sum of the numbers in the file.

We have the name of the file in a variable called filename and we create another variable called sum where we are going to hold the sum of the numbers. We initialize it to 0.

Then we open the file, by default, for reading and get back the filehandle.

Using an each loop we can iterate over the lines of the file. In each iteration the content of the current line is in the variable called line. Even though we expect it to be a number, what Ruby read in is kept as String. We need to convert it into a number, an integer number in this case, using the to_i method. We can then add the number to the content of sum using the += operator.

After the loop is finished, when we don't have anything else to do with the file, we should call the close method to close the file.

At the end we print out the content of sum


filename = '../data/numbers.txt'

sum = 0
fh = open filename
fh.each do |line|
   sum += line.to_i
end
fh.close

puts "The total is #{sum}"

Sum of numbers using a one-liner

There is another solution, using some more advanced techniques in Ruby:

Here the whole expression computing the sum is embedded in a string. The final result of this expression will be included in the output.


filename = '../data/numbers.txt'

puts "The total is #{open(filename).readlines.map(&:to_i).reduce :+}"

Inside the string we have the following expression:

open(filename).readlines.map(&:to_i).reduce :+

Let's take that apart. The first statement is open(filename). It opens the file for reading and returns a filehandle.

Instead of assigning it to a variable, though, we immediately use this filehandle and call its readlines method.

The readlines method will return an array of the lines. Each line in the original file is an element in the returned array.

The map method will go over each element of the array and call the given function on each element. Specifically it will call to_i on each element of the array converting each element into a number. So the map method will already return an array of numbers.

In the last section we call the reduce method. It reduces the array to a single value by applying + to pairs of values. More specifically, it takes the first two values and applies + to them. Then it takes the result of this and the next element (the third element) and adds them together. Then it takes the result of this and the next element (the 4th), and so on. Effectively it means putting + between all the elements and calculating their sum.

filename = '../data/numbers.txt'

fh = open(filename)
lines = fh.readlines
numbers = lines.map(&:to_i)
sum = numbers.reduce :+

puts "The total is #{sum}"
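To see what reduce does with :+, here is a small sketch on a hard-coded array. The numbers are made up; the second part repeats the same additions by hand and keeps the intermediate results.

```ruby
numbers = [3, 7, 23, 9]

# reduce(:+) behaves like putting + between all the elements: ((3 + 7) + 23) + 9
total = numbers.reduce(:+)
puts total                 # 42

# The same steps written out by hand, collecting every intermediate result.
steps = []
running = numbers.first
numbers.drop(1).each do |n|
  running += n
  steps << running
end
puts steps.inspect         # [10, 33, 42]

# With a starting value, reduce returns that value for an empty array instead of nil.
empty_total = [].reduce(0, :+)
puts empty_total           # 0
```

The starting value is worth considering when the input file might be empty, as reduce(:+) on an empty array returns nil.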


timestamp: 2015-10-20T16:00:01 tags:

  • open
  • each
  • to_i
  • map
  • reduce

Solution: Number guessing game in Ruby

In this exercise, you were asked to create the first version of the Number guessing game. We are going to see a solution in Ruby.


hidden = rand(200)
# puts hidden

print "Type in your guess: "
guess = gets.to_i

if guess == hidden
    puts "Hit"
elsif guess < hidden
    puts "Your guess is smaller than the hidden number"
else
    puts "Your guess is bigger than the hidden number"
end

We use the rand function to generate a random number. If we called it without any parameter, rand would return a floating point number between 0 and 1 (1 itself is never returned). As we passed a whole number to it, rand generated a whole number between 0 and 199. The number we gave, 200, is the upper limit and is never returned.

There is a commented out line printing the hidden value back. I used it to be able to test the code which was comparing my input to the hidden number.

Then we use the print function to ask the user to type in a number. We use print instead of puts to avoid adding a trailing newline, so the cursor will appear on the same line where the request was printed.

gets reads in whatever we type in response. to_i converts it to integer. (By default when we read in something from the standard input, that will be a string. Even if it only contains numbers. We have to tell Ruby to convert it to an integer.)

Then we have 3 cases depending on the relative values of the hidden and guess variables.
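The three cases are easier to try out if the comparison is separated from the input and output. The following is only a sketch: check_guess is a made-up helper name, and it returns the message instead of printing it.

```ruby
# Hypothetical helper: return the message for a guess instead of printing it.
def check_guess(guess, hidden)
  if guess == hidden
    'Hit'
  elsif guess < hidden
    'Your guess is smaller than the hidden number'
  else
    'Your guess is bigger than the hidden number'
  end
end

hidden = rand(200)   # a whole number from 0 up to and including 199
puts check_guess(100, hidden)

puts check_guess(5, 5)    # Hit
puts check_guess(3, 5)    # Your guess is smaller than the hidden number
puts check_guess(9, 5)    # Your guess is bigger than the hidden number
```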

timestamp: 2015-10-18T11:00:01 tags:

  • exercises
  • rand
  • gets
  • print
  • puts
  • if
  • elsif
  • else

Count web server hits using Ruby

The exercise was, that given a log file created by the Apache web server (or by any web server), to count how many hits arrived from each individual IP address.

Earlier we saw a solution in which we counted the number of hits from localhost vs any other address. That program already had the part that read the file line-by-line and extracted the IP address.

This time we just need to change the method of counting. Instead of having two counters, one for the local and one for the remote IPs, we need to create an unknown number of counters. We could do that using two arrays, but that would be complicated and slow.

Instead of that we use a hash. (Called a dictionary in Python and also called hash in Perl.) In that hash the keys are going to be the IPs and the values are going to be the number of times the specific IP was seen.

A hash can start out empty and grow dynamically.

So we create an empty hash called count. We open the file and iterate over it line-by-line. We locate the first space, which comes right after the IP address, and use the returned position to extract the IP address. Now that we have the current IP address in a variable, we can check whether we have already seen it by asking if its value in the hash is true. If it is, we increment it by 1. If this is the first time we see the IP address, we add it as a key to the hash and initialize its value to 1.

That's the whole story.

In the last 3 lines we iterate over all the key-value pairs in the hash and print each IP address together with its counter.


if ARGV.length != 1 then
    puts "We need the name of the log file"
    exit
end

filename = ARGV[0]

count = {}

fh = open filename
fh.each do |line|
    length = line.index(' ')
    ip = line[0, length]
    if count[ip]
       count[ip] += 1
    else
       count[ip] = 1
    end
end

count.each do |k, v|
    puts "#{k}    #{v}"
end
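The if-else around the counter can be avoided with a hash that has a default value. This is just a sketch of that alternative; the log lines are made up, and only the part before the first space matters.

```ruby
# Made-up log lines standing in for the content of the log file.
lines = [
  '127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET / HTTP/1.1" 500 606',
  '139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET / HTTP/1.1" 200 34',
  '127.0.0.1 - - [10/Apr/2007:10:53:10 +0300] "GET /favicon.ico HTTP/1.1" 200 11'
]

# Hash.new(0) returns 0 for keys that are not in the hash yet.
count = Hash.new(0)

lines.each do |line|
  ip = line[0, line.index(' ')]
  count[ip] += 1
end

count.each do |ip, hits|
  puts "#{ip}    #{hits}"
end
```

Because a missing key reads as 0, count[ip] += 1 works both for the first and for every later occurrence of an IP address.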

timestamp: 2015-10-30T13:30:01 tags:

  • index
  • substr
  • open
  • ARGV
  • []

split in Ruby

String objects in Ruby have a method called split. It is similar to the split function of Perl. It can cut up a string into pieces along a pre-defined string or regex returning an array of smaller strings.

In the first example you can see how to split a string every place where there is a comma ,:

require 'pp'

words_str = 'Foo,Bar,Baz'
words_arr = words_str.split(',')
pp words_arr   # ["Foo", "Bar", "Baz"]

In the second example we use a Regex to match the places where we would like to cut up the string. It makes the splitting much more flexible:

require 'pp'

words_str = 'One   -  Two-  Three'
words_arr = words_str.split(/\s*-\s*/)
pp words_arr   # ["One", "Two", "Three"]

Split with limit

We can pass a second parameter to split that will limit the number of returned values. If we pass 3, then split will make two cuts and return three pieces, the last of which holds the rest of the string:

require 'pp'

words_str = 'Foo,Bar,Baz,Moo,Zorg'
words_arr = words_str.split(',', 3)
pp words_arr   # ["Foo", "Bar", "Baz,Moo,Zorg"]

Split by empty string

As a slightly special case, if we use an empty string (or empty regex) to split with, then we will get back an array of the individual characters:

require 'pp'

words_str = 'Foo,Bar,Baz'
words_arr = words_str.split('')
pp words_arr   # ["F", "o", "o", ",", "B", "a", "r", ",", "B", "a", "z"]
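For this particular case Ruby strings also have a chars method. The short sketch below, using a made-up string, suggests that it gives the same array as splitting by the empty string.

```ruby
words_str = 'Foo,Bar'

by_split = words_str.split('')
by_chars = words_str.chars

p by_split               # ["F", "o", "o", ",", "B", "a", "r"]
p by_chars == by_split   # true
```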

timestamp: 2015-10-31T12:00:01

The conditional operator in Ruby

The official name of the ? : operator is the conditional operator, though most people know it as the ternary operator indicating the number of operands it has.

There are several unary operators that handle a single operand. For example - can be a unary operator.

Most of the operators are binary operators that handle two operands. For example * always needs two operands to work on, and in most cases - is also used as a binary operator.

There is only one ternary operator, that is, one with 3 operands. It is called the conditional operator, but because it is the only one with 3 operands, most people call it the ternary operator.

Conditional Operator in Ruby

In general it looks like this:

CONDITION ? EVALUATE_IF_CONDITION_WAS_TRUE : EVALUATE_IF_CONDITION_WAS_FALSE

It evaluates the CONDITION. If it is true then the code evaluates the part between ? and : and returns the result. If the CONDITION is false, then the middle part is skipped and the 3rd part is evaluated and the result of that expression is returned.

Example puts

In this example the return value of the conditional operator is passed to puts

filename = ARGV.shift
puts filename ? filename : 'No file given'

Example smaller

In this example we check which one of the two random values is smaller and return that one:


x = rand()
y = rand()
puts x
puts y
 
smaller = x < y ? x : y
puts smaller
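To make the evaluation order concrete, the sketch below writes the same decision twice, once with the conditional operator and once as an if-else expression. The fixed values are made up so that the result is predictable.

```ruby
x = 3
y = 8

# Using the conditional operator
smaller = x < y ? x : y

# The same decision written as an if-else expression
smaller_if = if x < y
               x
             else
               y
             end

puts smaller      # 3
puts smaller_if   # 3
puts [x, y].min   # 3
```

For this specific task the min method of arrays is probably the simplest choice; the conditional operator is more useful when the two branches are arbitrary expressions.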

timestamp: 2015-10-31T14:23:23 tags:

  • ?:

Convert String to Number in Ruby

When reading data from a file or from other external resources, they always arrive in Ruby as String objects.

If we would like to use them as numbers we first need to convert them to numbers.

But which number and how?

The String objects in Ruby have several methods to convert the string object into a number.

  • to_i will convert the String to an Integer.
  • to_f will convert the String to a Float, a floating point number.
  • to_r will convert the String to a Rational number.
  • to_c will convert the String to a Complex number.

Concatenation

Given two numerical values that are actually String objects (because of the quotation marks around them), if we use the + operator it will work as concatenation.

a = "2"
b = "3"
puts a+b  # 23

no implicit conversion of Fixnum into String (TypeError)

If one of the values is a String object and the other one is a real number (no quotes), and we try to add them together as in the next example:

puts "2"+3 

We'll get an exception: no implicit conversion of Fixnum into String (TypeError). (Since Ruby 2.4 the message says Integer instead of Fixnum.)

String that looks like an integer

A String that holds an integer can be converted to an Integer, a Float, a Rational number, or a Complex number:

puts a.to_i # 2
puts a.to_f # 2.0
puts a.to_r # 2/1
puts a.to_c # 2+0i

Setting the base: Converting binary, octal, hexadecimal to decimal

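Normally to_i assumes that the number in the String object is in base 10 representation, but what if you would like to change that?

What if you'd like to treat the String as a binary number, an octal number, or a hexadecimal number? You pass the base as the parameter of the to_i method. In the examples it is written as base=2, which works because Ruby assigns 2 to a local variable called base and then passes that value on. to_i does not take a named parameter, so writing to_i(2) does exactly the same: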

puts "11".to_i            # 11
puts "11".to_i(base=2)    # 3
puts "11".to_i(base=16)   # 17

Of course hexadecimal "numbers" can also contain the letters a-f. The to_i method can deal with that too.

puts "aB".to_i(base=16)   # 171

Which brings up the question, what will happen if we use to_i without base on a string with hexadecimal number in it, or just any base with values that are not part of the 'digits' it can handle? They all silently return 0.

puts "aB".to_i            # 0
puts "9".to_i(base=8)     # 0

That leads us to the question: What happens if not all the characters are convertible to a number? The answer is simple. to_i will convert all the 'digits' from the beginning of the string up to the point where it does not understand any more. Then it will abandon the rest of the string, even if there are more understandable digits later on.

puts "2x3".to_i           # 2
puts "2 3".to_i           # 2
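When silently dropping the rest of the string is not acceptable, the Integer() method of Ruby is a stricter alternative: it raises an ArgumentError instead of guessing. The sketch below wraps it in a made-up helper called strict_int that returns nil for strings it cannot fully convert.

```ruby
# Hypothetical helper: return the Integer, or nil when the string is not a clean number.
def strict_int(str, base = 10)
  Integer(str, base)
rescue ArgumentError
  nil
end

p "2x3".to_i             # 2    (lenient: stops at the x)
p strict_int("2x3")      # nil  (strict: rejected)
p strict_int("42")       # 42
p strict_int("aB", 16)   # 171
p strict_int("9", 8)     # nil  (9 is not an octal digit)
```

Returning nil is only one possible design; letting the exception propagate might be a better fit when bad input should stop the program.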

Converting to Floating point and Rational number

We can use the other methods to convert a string to a Floating point, a Rational number, or even a Complex number. Some of those will understand a decimal point in the String:

c = "14.6"
puts c.to_i    # 14
puts c.to_f    # 14.6
puts c.to_r    # 73/5
puts c.to_c    # 14.6+0i

Some of them will even understand the letter e marking exponent in the String:

e = "2.3e4x5"
puts e         # 2.3e4x5
puts e.to_i    # 2
puts e.to_f    # 23000.0
puts e.to_r    # 23000/1
puts e.to_c    # 23000.0+0i

Full example


a = "2"
b = "3"
puts a+b  # 23
puts '-------'

# puts "2"+3   # no implicit conversion of Fixnum into String (TypeError)

puts a.to_i # 2
puts a.to_f # 2.0
puts a.to_r # 2/1
puts a.to_c # 2+0i
puts '-------'

puts "11".to_i            # 11
puts "11".to_i(base=2)    # 3
puts "11".to_i(base=8)    # 9
puts "11".to_i(base=16)   # 17
puts '-------'

puts "aB".to_i(base=16)   # 171
puts "aB".to_i            # 0
puts "9".to_i(base=8)     # 0
puts '-------'

puts "2x3".to_i           # 2
puts "2 3".to_i           # 2
puts '-------'

c = "14.6"
puts c.to_i    # 14
puts c.to_f    # 14.6
puts c.to_r    # 73/5
puts c.to_c    # 14.6+0i
puts '-------'


e = "2.3e4x5"
puts e         # 2.3e4x5
puts e.to_i    # 2
puts e.to_f    # 23000.0
puts e.to_r    # 23000/1
puts e.to_c    # 23000.0+0i
puts '-------'


Comments

"foo.to_i" works for simple cases. One might consider: "Integer(foo)".

"1.4.5".to_i => 1
Integer("1.4.5") => invalid value for Integer()

The latter is better suited as a result.


Hi, in this case my string is "12345.678". When I use to_f the result is 12345.67 (only 2 digits after the decimal point). How can I get the full 12345.678? I mean, how can I get more digits after the decimal point? Thanks guys

timestamp: 2015-11-02T15:00:01 tags:

  • String
  • to_i
  • to_f
  • to_r
  • base

Count digits in Ruby

In this exercise we had to count digits in a file.

The algorithm

We need to go over the file line-by-line. For each line we then need to go over it character-by-character. If a character is a digit, we need to increment the appropriate counter. We could use 10 different variables (c0, c1, ... c9) to count the number of occurrences of the respective digits, but there is a much more comfortable solution using arrays. Specifically we use an array called count with 10 elements in it. In count[0] we will count the number of 0s, in count[1] we will count the number of 1s, ... up till count[9] in which we will count the number of 9s.

The solution in Ruby

In this solution written in Ruby, we start by getting the name of the file to be processed from the command line.

At first we check if ARGV has the right number of arguments. If not we print an error message and exit.

Then we declare the count array.

Get the name of the file from the first element of the ARGV array containing the values from the command line and open it for reading and iterate over it using each.

In each iteration line will contain a single line. We need to go over it character-by-character. There are a number of ways to do this. In this solution I've decided to split the string into individual characters and then to iterate over the characters.

Using the split method with an empty string we cut up the string at every place where the empty string matches, meaning we split up the string between any two characters. The result is a list of characters assigned to the chars array.

Then we iterate over the elements of this array. In each iteration c is a single character from the source file. Based on our assumption, it is either a digit (0-9) or space.

If it is not a space (if c != ' ') then we would like to increment the appropriate counter.

If this is our first encounter with the particular digit we need to initialize the array element that will count that digit.

if not count[c.to_i] then
    count[c.to_i] = 0
end

Then in every case we increment it by one using the following expression:

count[c.to_i] += 1

At the end of the loop we have counted all the digits. We need to display the results.

For that we iterate over all the numbers between 0 and 9 and print out the number, and the counter of that number.

We could have printed the content of count[i] directly, but then we would have holes for the digits that don't appear in our source (in this case 4 and 6), because those elements of the array are nil. That would not look right. Any observer would think there is a bug in the code. It is much better to print 0 if the digit never appeared. We do that by using the conditional operator.

if ARGV.length < 1
    puts "Missing filename from the command line"
    exit
end

count = []

filename = ARGV[0]
fh = open filename
fh.each do |line|
    # chomp removes the trailing newline, which to_i would otherwise count as a 0
    chars = line.chomp.split('')
    chars.each do |c|
        if c != ' ' then
            if not count[c.to_i] then
                count[c.to_i] = 0
            end
            count[c.to_i] += 1
        end
    end
end

(0..9).each do |i|
    print i, '    ', ( count[i] ? count[i] : 0), "\n"
end
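An alternative worth sketching is to create the array with ten zeros up front, so neither the initialization inside the loop nor the conditional operator at the end is needed. The input lines below are made up and stand in for the content of the file; each_char is used here instead of split.

```ruby
# Made-up input lines standing in for the content of the file.
lines = ["23 34 9512341\n", "3 35 2 1 098\n"]

# Ten counters, all starting at 0, so no initialization is needed inside the loop.
count = Array.new(10, 0)

lines.each do |line|
  line.each_char do |c|
    # Only digits are counted; spaces and newlines are skipped.
    count[c.to_i] += 1 if c =~ /[0-9]/
  end
end

(0..9).each do |i|
  puts "#{i}    #{count[i]}"
end
```

Checking for a digit explicitly, rather than only excluding the space, also keeps any other unexpected character from being counted as 0.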


timestamp: 2015-11-03T22:23:23 tags:

  • open
  • split
  • +=

Hello World using CGI in Ruby

The Hello World exercise is usually the first thing people do in every language and in every environment. Here we are going to do it using Ruby and the cgi module that comes with Ruby.

Hello World in plain Ruby

Writing a Hello World! script in Ruby that can run as a CGI program is quite simple. You need to make sure that your web server is set up to handle CGI requests, and then you need to create a simple script:

#!/usr/bin/ruby

print "Content-type: text/html\n\n"
print "Hello World!\n"

It has a few elements that other, command line scripts might not necessarily need.

The first line, also called the hashbang line, needs to point to the Ruby interpreter. In our case this means the first line is #!/usr/bin/ruby. Please note, there are no spaces in that line and it must be the first line in the file. It is used by Apache and the Unix/Linux environment to determine which interpreter will understand the code in this file.

Then, before we print out any of the real HTML content, we need to send out the HTTP header followed by an empty line. At the very least we need to print out the Content-type. Therefore the next line prints the Content-type followed by two newlines. The first is the end of the Content-type line, the second is the empty line.

Then can come the actual content of the page. In our case this is a simple string that reads Hello World!
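The structure of the output (header line, empty line, content) can be sketched as a small function. The name cgi_response and its content_type parameter are made up for this illustration and are not part of any library.

```ruby
# Hypothetical helper: build the full text a CGI script has to print.
def cgi_response(body, content_type = 'text/html')
  header = "Content-type: #{content_type}\n"
  # The empty line tells the server and the browser where the header ends.
  header + "\n" + body
end

print cgi_response("Hello World!\n")
```

Running it on the command line prints the Content-type line, an empty line, and then Hello World!, which is exactly what the web server expects from the script.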

We need to place the file in a directory that was configured to handle CGI scripts and we need to make the file executable by running:

$ sudo chmod +x web_hello_world.rb

Once we have this we can use our regular browser, or curl to access the appropriate URL.

Hello World in Ruby using the cgi module

While in this simple case we could get by without any module, I think it is important to see how the "Hello World" script can be written using the cgi module that comes with ruby.

#!/usr/bin/ruby

require 'cgi'
cgi = CGI.new
print cgi.header
 
print "Hello World!\n"

The difference is that in this case we use the header method of the cgi instance to create the Content-type line. There is nothing special here, except maybe that you don't need to remember the content type of an HTML page.

timestamp: 2015-11-12T14:00:01 tags:

  • cgi

Ruby ENV - access the environment variables

Ruby maintains a hash called ENV that gives us access to the environment variables such as PATH or HOME.

We can see them all using pp, the pretty printer of Ruby.

require 'pp'

pp ENV

We can also access a value directly, for example: puts ENV['PATH']. We can add new environment variables or change existing ones, with one big caveat: once our Ruby program ends these changes will be gone.

If we start a new process from our Ruby program after we have made modifications to ENV, all those modifications will be seen by the other process. However, the changes cannot propagate to the process that launched our Ruby program.

For example if we run this program:

system("echo $PATH")
ENV['PATH'] = '/nothing/here'
system("echo $PATH")

The output will look something like this:

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
/nothing/here

The first time we called system the new shell saw the original content of the PATH environment variable. Then we changed it and set it to something horribly bad. When we called system the second time the new shell saw the new value.

After running the above script execute the following in the Unix/Linux shell:

echo $PATH

It will print the same path as it did with the first call to system.

/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

This means our changes in the Ruby code have not changed the environment variable for the parent process.

This is a feature of most or all of the Operating Systems.
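If a variable is needed only by one child process, it does not have to go into ENV at all. The sketch below assumes a Unix/Linux shell and uses a made-up variable name, DEMO_GREETING. It passes a hash as the first argument of IO.popen so that the output of the child can be captured; system accepts the same kind of hash.

```ruby
# DEMO_GREETING is a made-up variable name used only for this example.
ENV.delete('DEMO_GREETING')

# Pass the variable to one child process only, without touching our own ENV.
output = IO.popen({ 'DEMO_GREETING' => 'hello' }, 'echo $DEMO_GREETING', &:read)
puts output                 # hello

# Our own environment was not changed.
p ENV['DEMO_GREETING']      # nil
```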

timestamp: 2015-11-18T15:50:01 tags:

  • ENV
  • system

How to write to file in Ruby

Writing to a file is quite simple in Ruby. The File class helps us with it.

Write to file

f = File.new('out.txt', 'w')
f.write("Hello World!\n")
f.write("Hello Foo!\n")
f.close     

If we run this it will create the file out.txt if it did not exist before, or it will overwrite it if it existed earlier. Any previous content will be removed.

Append to file

If the second parameter to the new method is 'a' and not 'w' then we are going to append to the end of the file. This means that if the file already has some content, it will be kept and anything new will be added to the end. If the file did not exist earlier then this too will create it.

f = File.new('out.txt', 'a')
f.write("Hello World!\n")
f.write("Hello Foo!\n")
f.close     
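Both modes can be tried together in a small sketch that uses the block form of File.open, which closes the file automatically at the end of the block. The temporary directory is only there so that the example does not touch any real file.

```ruby
require 'tmpdir'

# Work in a temporary directory; mktmpdir removes it when the block ends.
content = Dir.mktmpdir do |dir|
  path = File.join(dir, 'out.txt')

  # 'w' creates or overwrites the file; the block form closes it automatically.
  File.open(path, 'w') do |f|
    f.write("Hello World!\n")
  end

  # 'a' keeps the existing content and adds to the end.
  File.open(path, 'a') do |f|
    f.write("Hello Foo!\n")
  end

  File.read(path)
end

puts content
```

The output has both lines, Hello World! followed by Hello Foo!, showing that the second open did not remove what the first one wrote.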

timestamp: 2015-11-21T08:30:01 tags:

  • File
  • write

Logical operators in Ruby (and, or, not), (&&, ||, !)

In Ruby there are two sets of logical operators:

and, or, not

&&, ||, !

Normally you can use either set of the operators but it is not recommended to mix them in the same expression. The difference between the two sets is the precedence. The operators that are words (and, or, not) are lower in the operator precedence table than the other three.

Logical operators are used in a conditional expression, for example in an if statement or in the ternary operator, when we would like to combine 2 or more conditions.

For example we would like to check if the age of the user is bigger than 18, but smaller than 23.

We can write:

18 < age and age < 23

The same can also be written as

18 < age && age < 23

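In this example there is no difference between the two expressions. The difference in precedence only shows when the logical operators are combined with other operators, for example with an assignment. The && are there mostly for historical reasons as they were used in many programming languages before Ruby. and is usually much more readable.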

if ARGV.length < 1
   puts "Needs an argument for age"
   exit
end

age = ARGV[0].to_f

if 18 < age and age < 23
    puts "In the range"
end


if 18 < age && age < 23
    puts "In the range"
end
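The precedence difference between the two sets becomes visible once an assignment is involved. The following sketch, with made-up variable names, shows the same-looking expression giving two different results.

```ruby
t = true
f = false

# and binds more loosely than =, so this is parsed as (a = t) and f
a = t and f

# && binds more tightly than =, so this is parsed as b = (t && f)
b = t && f

p a   # true
p b   # false
```

Adding parentheses, as in a = (t and f), removes the surprise, which is one reason not to mix the two sets in the same expression.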

timestamp: 2019-04-16T16:44:01 tags:

  • and
  • &&