Reading Files
(Note: you can find all the files you need for this class in the "Files" button at the bottom of the page.)
There's a lot of data out there. You can write programs to do stuff with this data. But how? If you are lucky, the data is in a standard format. One simple format is called CSV, which stands for comma-separated values. This means that you have a set of records, one on each line, where each value (or column) in a record is separated by a comma.
The US government releases all kinds of interesting data for free. One such source of data is the US Census, which occurs every ten years. The Census Bureau is already getting ready for the next one in 2010, but the data from the 2000 Census is available on the web.
Say you wanted to know how popular your last name was. How does it rank against all the other surnames in the US? This file has all the surnames collected in the last census where at least 100 people had that last name. See also this web page.
Here's a program that reads through that file, using Ruby's File class, looking for the name you entered. Note that we are using a new String method called split. Calling split on a string turns it into an array of strings by breaking it apart at a separator that you supply. Since we are looking at comma-separated values, we use a comma as the separator.
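Before looking at the whole program, it may help to see split on a single line. This is only a small sketch: the record below is typed in by hand as an illustration of the name, rank, count layout, so don't treat it as a line copied from the real file.

```ruby
# An illustrative record (assumed layout: name, rank, count).
# Lines read from a file end with a newline, so we include one here.
line = "SMITH,1,2376206\n"

# chomp strips the trailing newline; split(',') breaks what is left
# apart at every comma and returns an array of strings.
fields = line.chomp.split(',')

name = fields[0]   # first column
rank = fields[1]   # second column (still a String, not a number)

puts fields.inspect
puts "#{name} is ranked #{rank}"
```

Notice that every piece comes back as a String. If you want to do arithmetic with the rank, convert it first with rank.to_i.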
So here's the program.
    filename = 'app_c.csv'
    file = File.open(filename)

    puts "What's your surname?"
    your_name = gets.chomp.upcase

    found = false
    first_line = true

    file.each do |line|
      if first_line
        first_line = false
        next
      end

      data = line.chomp.split(',')
      name = data[0]
      rank = data[1]

      if your_name == name
        puts "Your surname is ranked #{rank}th among the names of count >= 100 in the 2000 Census"
        found = true
        break
      end
    end

    puts "I could not find your surname among the names of count >= 100 in the 2000 census" unless found
Here's another version that uses a Hash. How are these programs different?
    filename = 'app_c.csv'
    file = File.open(filename)

    name_data = Hash.new
    first_line = true

    file.each do |line|
      if first_line
        first_line = false
        next
      end

      data = line.chomp.split(',')
      name = data[0]
      rank = data[1]
      name_data[name] = rank
    end

    puts "What's your surname?"
    name = gets.chomp.upcase

    if name_data.has_key?(name)
      rank = name_data[name]
      puts "Your surname is ranked #{rank}th among the names of count >= 100 in the 2000 Census"
    else
      puts "I could not find your surname among the names of count >= 100 in the 2000 census"
    end
Writing Files
Now let's try our hand at writing files. Notice how you opened a file for reading with the File.open method? You can open a file for writing with the same method. You just need to pass a second, optional argument, the string 'w', to indicate that you want to write to the file instead of just reading it. Be careful: opening an existing file with 'w' erases whatever was in it. Once you have an open file, you can call the puts method on it to write strings to it.
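Here is a minimal sketch of those steps. The file name greeting.txt is made up for this example; any name you are allowed to write to would do.

```ruby
# 'greeting.txt' is a hypothetical file name used only for this sketch.
# The 'w' argument opens the file for writing, creating it if needed
# and erasing anything it already contained.
output = File.open('greeting.txt', 'w')

# puts on a file works like the puts you already know,
# but the text goes into the file instead of onto the screen.
output.puts "Hello, file!"
output.puts "Second line"

# Closing the file makes sure everything is actually written out.
output.close

# Read the whole file back to check what ended up in it.
written = File.read('greeting.txt')
puts written
```

Running this a second time gives the same two lines, not four, because 'w' starts the file over each time.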
Note that I have to close the file when I'm done writing stuff to it.
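If you are worried about forgetting to close a file, Ruby offers another way to write this: give File.open a block, and the file is closed for you when the block ends. This is an alternative sketch, not what the program in this section does, and notes.txt is a made-up file name.

```ruby
# 'notes.txt' is a hypothetical file name used only for this sketch.
# When File.open is given a block, the open file is passed in as the
# block variable and closed automatically when the block finishes.
File.open('notes.txt', 'w') do |file|
  file.puts "first line"
  file.puts "second line"
end

# The file is already closed here, so it is safe to read it back.
line_count = File.readlines('notes.txt').length
puts "Wrote #{line_count} lines"
```

Both styles produce the same file. The block form just means there is no close call to forget.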
I found a file listing the US Presidents and I wanted to know how their names ranked. Here's the code.
    president_file = 'presidents.txt'
    name_file = 'app_c.csv'

    president_surnames = []
    name_data = {}

    File.open(president_file).each do |line|
      name = line.split(',').first
      surname = name.split(' ').last
      president_surnames << surname
    end

    first_line = true

    File.open(name_file).each do |line|
      if first_line
        first_line = false
        next
      end

      data = line.chomp.split(',')
      name = data[0]
      rank = data[1]
      name_data[name] = rank
    end

    output = File.open('president_name_ranks.txt', 'w')

    president_surnames.each do |surname|
      if name_data.has_key? surname.upcase
        output.puts "#{surname},#{name_data[surname.upcase]}"
      else
        output.puts "#{surname},0"
      end
    end

    output.close