
Exploring Latest ruby-lib Additions

Recently I was browsing through the NEWS file in the Ruby Subversion repository, and two methods added to the Enumerable module since the 1.9.1 release caught my attention: Enumerable#chunk and Enumerable#slice_before. The docs for these are not available at http://ruby-doc.org/, so you’d need to go straight to the source code. The source code comments have a few usage examples, but I wanted to expand on them, since I feel these methods are, if not crucial, certainly very useful additions to the set of excellent Enumerable/Array methods Ruby already provides. This post walks through a practical use of each.

Enumerable#chunk

Using this method you can chunk consecutive elements of a collection into sub-arrays based on some property calculated for each element by the passed block. The result (an enumerator; call to_a on it to get an array) has Array#assoc semantics, i.e. each item is a two-element array whose first value is the property on which the chunk was selected and whose second value is the array of consecutive elements having this property.

Suppose we’re implementing something like the Facebook feature where similar updates from the same user are collapsed, so that you get less spam from your overly active friends. For the sake of demonstration we can assume that there are no special similarity rules and all updates from the same user are considered similar. Assuming that you have an Update model which stores the update-related info (like who did what and on which date), you can chunk the similar updates together using the following code in the controller:
class UpdatesController < ApplicationController
  def index
    # since we're not interested in the property we're chunking the array on (user here),
    # we get just the chunks themselves with +map(&:second)+
    @chunked_updates = Update.order("created_at DESC").chunk(&:user).map(&:second)
  end
end
* Note: this example requires Rails 3 and Ruby 1.9.2 to run properly.

And in the corresponding view:
<!-- views/updates/index.html.erb -->
<% @chunked_updates.each do |update_chunk| %>
  <%= render update_chunk.first %>
  <% if update_chunk.size > 1 %>
    Show <%= pluralize update_chunk.size - 1, "similar update" %>
    <!-- ... render additional updates here ... -->
  <% end %>
<% end %>
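
To see the shape of what chunk returns without a Rails app, here is a minimal plain-Ruby sketch. The Update struct and the sample data are made up for illustration and stand in for the real model. Note that map(&:second) in the controller relies on ActiveSupport's Array#second; plain Ruby can use last or block destructuring instead.

```ruby
# Hypothetical stand-in for the Update model (illustration only).
Update = Struct.new(:user, :action)

updates = [
  Update.new("alice", "liked a photo"),
  Update.new("alice", "commented on a post"),
  Update.new("bob",   "joined a group"),
  Update.new("alice", "changed her status")
]

# chunk returns an enumerator of [key, consecutive_elements] pairs.
chunks = updates.chunk { |update| update.user }.to_a

# Drop the key and keep only the groups, like map(&:second) does in Rails.
chunked_updates = chunks.map { |_user, group| group }

# Only consecutive updates are collapsed, so alice shows up twice.
summary = chunks.map { |user, group| [user, group.size] }
p summary # => [["alice", 2], ["bob", 1], ["alice", 1]]
```

Because the result keeps the key as the first element of each pair, chunks.assoc("bob") would find the first chunk belonging to bob.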

Enumerable#slice_before

This method is particularly useful when parsing files with a flat structure like:

Section 1 Header
Section 1 contents
...
Section 2 Header
Section 2 contents
...
Using slice_before you can extract such sections/slices. It turns out there are quite a few places where this could be useful, but probably the most useful of them is a structure-aware grep for log files, described in the next section.
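
As a small illustration of extracting such sections, the sketch below slices an array of lines every time a header line is seen. The sample lines and the rule "a header ends with the word Header" are assumptions made for this example only.

```ruby
lines = [
  "Section 1 Header",
  "Section 1 contents",
  "more Section 1 contents",
  "Section 2 Header",
  "Section 2 contents"
]

# The block returns true for lines that should start a new slice.
sections = lines.slice_before { |line| line.end_with?("Header") }.to_a

sections.each do |header, *contents|
  puts "#{header}: #{contents.size} content line(s)"
end
```

Like chunk, slice_before returns an enumerator, so to_a is needed to get the array of slices.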

Structure aware log grep

Suppose you’ve got a production Rails app running with multiple clients/search engines making requests simultaneously. Now you need to investigate an issue with one particular client, or you want to monitor requests hitting some particular controller. Doing tail -f log/production.log outputs a lot of requests you’re not interested in. You can pipe the output to grep, but then you’d get one line per request, while the whole request log entry consists of multiple lines.

One solution would be to use grep’s --context option, which shows a fixed number of lines around the line where the match occurred. The problem is that the number of context lines is fixed, while the number of lines in a request log entry may vary (for example, if an exception was thrown there will be the whole stacktrace, while for a normal request there will be only about 3-4 lines). A more robust solution is to parse the log file into “slices” on the fly and show only the ones you’re interested in. I wrote a script for this, grep_rails_log_file.rb, published as a gist. Some usage examples:

Show only local requests: 
tail -f log/production.log | grep_rails_log_file.rb 127.0.0.1

Show only GET requests: 
tail -f log/production.log | grep_rails_log_file.rb GET

Show only requests to /updates path: 
tail -f log/production.log | grep_rails_log_file.rb /updates
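
For readers without the gist at hand, here is a minimal sketch of how such a filter could work. It is not the original script. It assumes that every request's log entry begins with a line starting with "Started" (as in the Rails 3 log format); the method name each_matching_request and the sample log lines are made up for illustration.

```ruby
# Assumption: a new request entry begins with a line starting with "Started".
REQUEST_START = /\AStarted /

# Yields each request (an array of lines) that contains +pattern+ on any line.
def each_matching_request(lines, pattern)
  lines.slice_before { |line| line =~ REQUEST_START }.each do |request|
    yield request if request.any? { |line| line.include?(pattern) }
  end
end

# Made-up sample data standing in for log/production.log.
sample_log = [
  %(Started GET "/updates" for 127.0.0.1 at 2010-06-01 10:00:00\n),
  "  Processing by UpdatesController#index as HTML\n",
  "Completed 200 OK in 35ms\n",
  %(Started POST "/sessions" for 10.0.0.5 at 2010-06-01 10:00:01\n),
  "  Processing by SessionsController#create as HTML\n",
  "Completed 302 Found in 12ms\n"
]

local_requests = []
each_matching_request(sample_log, "127.0.0.1") { |request| local_requests << request }
puts local_requests
```

In a script fed by tail -f, the same method could be called as each_matching_request(STDIN.each_line, ARGV.first) { |request| puts request }. Since a slice is only complete once the next "Started" line arrives, the output would lag one request behind the log.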

Use it right now

And the best part is that you can start using all this goodness right now, even if your project is running on an older Ruby version. The excellent Backports Library by Marc-André Lafortune already has these methods backported. Usually I don’t include the whole library but just rip out the needed method and place it under the lib/core_ext folder.
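
To give an idea of what such a file under lib/core_ext might look like, here is a deliberately simplified sketch. It is not the Backports implementation: it only supports the block form and returns a plain array rather than an enumerator. The module name SliceBeforeFallback is hypothetical.

```ruby
# Simplified stand-in for slice_before (block form only, eager, returns an Array).
module SliceBeforeFallback
  def self.slice_before(enum)
    slices = []
    enum.each do |item|
      # Start a new slice on the very first item or whenever the block says so.
      if slices.empty? || yield(item)
        slices << [item]
      else
        slices.last << item
      end
    end
    slices
  end
end

module Enumerable
  # Only patch when the running Ruby does not already provide the method.
  unless method_defined?(:slice_before)
    def slice_before(&block)
      SliceBeforeFallback.slice_before(self, &block)
    end
  end
end

p SliceBeforeFallback.slice_before([1, 2, 3, 4]) { |i| i.even? } # => [[1], [2, 3], [4]]
```

The method_defined? guard keeps the native implementation in place on Rubies that already have it.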
  • @Manuel, Enumerable#group_by examines the whole collection grouping elements from anywhere in the collection based on the value computed by the block (and returning Hash), while Enumerable#chunk groups/"chunks" only the consecutive elements returning Array of Arrays (structure which can be searched with Array#assoc).

    Probably better illustrated with example:


    > [1,2,1].chunk { |i| i }.to_a
    => [[1, [1]], [2, [2]], [1, [1]]] # 3 "chunks" are returned

    > [1,2,1].group_by { |i| i }
    => {1=>[1, 1], 2=>[2]} # only 2 groups are detected
  • manuelmeurer
    How is Enumerable#chunk different from the existing Enumerable#group_by?
    http://apidock.com/rails/Enume...
  • Hey Nick, thanks, my first comment :)

    I tried to add comments to the Gists where it seemed appropriate, but for MySQL dumps and log files the output examples would be really long and not really descriptive.
  • nphoffman
    Thanks for the investigation and write-up, Evgeniy. It'd be great if some output from using your Enumerable#slice_before examples were shown.

    Cheers,
    Nick