Easy PDF Generation with Ruby, Rails, and HTMLDOC

For a recent project we needed (and wanted) a simple solution to generate PDF files. Ideally, the solution would use HTML for the general layout and design of the generated PDF, working just like a normal view in Rails.

After testing a number of potential PDF solutions, we came across a neat little library called HTMLDOC. It takes basic HTML and converts it to PDF, among other output formats.

For what we needed, it does a great job, especially for the price. To make it even easier to use, there is also a Ruby HTMLDOC gem to go along with it. Score!

To generate the PDF files we used plain old HTML, something we were already familiar with. It is exactly like creating a normal rhtml view.

Below is our experience installing and using HTMLDOC to generate PDF files from our Rails application. This has been tested and used on Linux and Mac OS X 10.4.9.

Note: You will need the proper tools installed to compile HTMLDOC. On Mac OS X, that usually means installing the developer tools.

1. Installing HTMLDOC

The first thing we need to do is get HTMLDOC downloaded, compiled, and installed. Copy and paste the following into your console:

curl -O http://ftp.easysw.com/pub/htmldoc/snapshots/htmldoc-1.9.x-r1521.tar.gz

tar zxvf htmldoc-1.9.x-r1521.tar.gz

cd htmldoc-1.9.x-r1521

./configure --prefix=/usr/local

make

sudo make install
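
Before moving on, it can be worth confirming that the htmldoc binary actually ended up somewhere on your PATH, since the gem shells out to it. Here is a minimal sketch in plain Ruby; find_executable is a hypothetical helper name of ours, not something provided by HTMLDOC or the gem:

```ruby
# Hypothetical helper (not part of HTMLDOC or the gem): look for an
# executable file called `name` in each directory listed in `path`.
# Returns the full path of the first match, or nil if there is none.
def find_executable(name, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

location = find_executable('htmldoc')
puts(location ? "htmldoc found at #{location}" : 'htmldoc is not on your PATH')
```

If this reports that htmldoc is missing, check that /usr/local/bin (the prefix used above) is on the PATH of the user your Rails application runs as.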

2. Installing the HTMLDOC Gem

Now that we have the HTMLDOC application ready to go, we want to install the HTMLDOC Ruby Gem to interface HTMLDOC with our Rails application:

sudo gem install htmldoc

3. Configuring Your Application

Next we need to configure our application. Open up your config/environment.rb file and add the following (to the end):

Mime::Type.register 'application/pdf', :pdf
require 'htmldoc'

Note: There is a way to make Rails handle the ‘.pdf’ extension format, but when we tried it, every page prompted us to download a file no matter which format we requested. After many attempts to fix the issue, we settled on the following solution:

4. PDF Renderer

We also added a method to our app/controllers/application.rb file to help DRY up the PDF generation, sort of like the render methods already included in Rails:

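# Renders the given view to HTML, then converts it to PDF with HTMLDOC.
# Takes the same options as render and returns the PDF data as a string.
def render_to_pdf(options = nil)
  data = render_to_string(options)
  pdf = PDF::HTMLDoc.new
  pdf.set_option :bodycolor, :white
  pdf.set_option :toc, false        # no table of contents
  pdf.set_option :portrait, true
  pdf.set_option :links, false
  pdf.set_option :webpage, true     # treat the input as a web page, not a book
  pdf.set_option :left, '2cm'       # left margin
  pdf.set_option :right, '2cm'      # right margin
  pdf << data
  pdf.generate
end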

Just pass it the same options you would pass render. Check the HTMLDOC Gem rdoc page for more options and configurations.
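
One caveat raised in the comments: HTMLDOC (at least the 1.8 series) does not handle UTF-8, so it can help to transcode the rendered HTML before handing it over. The commenter's suggestion uses Iconv, which was dropped from the standard library in Ruby 2.0; on newer Rubies String#encode does the same job. A minimal sketch, assuming ISO-8859-15 as the target (as in that comment) and using a hypothetical helper name, to_htmldoc_encoding:

```ruby
# Hypothetical helper: convert a UTF-8 string to ISO-8859-15 for HTMLDOC.
# Characters with no ISO-8859-15 equivalent (CJK text, for instance) are
# replaced with "?" instead of raising an error.
def to_htmldoc_encoding(html)
  html.encode('ISO-8859-15', 'UTF-8',
              invalid: :replace, undef: :replace, replace: '?')
end

converted = to_htmldoc_encoding('Café: 5 €')
puts converted.encoding
```

In render_to_pdf, that would mean wrapping the first line: data = to_htmldoc_encoding(render_to_string(options)). Note that this only papers over the limitation; text outside the target character set is still lost.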

Example

Here is an example controller method:

  def index
    @items = Item.find(:all)
 
    respond_to do |format|
      format.html # index.html
      format.xml { head :ok }
      format.pdf { send_data render_to_pdf({ :action => 'index.rpdf', :layout => 'pdf_report' }) }
    end
  end

Pretty typical Rails, no? We tell it to explicitly use that action/view and to use a different layout file.
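
One thing to watch for: a few readers in the comments hit "undefined method `size' for false:FalseClass" at this point, which suggests render_to_pdf handed send_data false rather than PDF data (the gem appears to return false when HTMLDOC fails). A small guard can turn that into a clearer error. This is only a sketch; ensure_pdf! and PDF_MAGIC are hypothetical names of ours, not part of the gem:

```ruby
# Hypothetical guard, not part of the HTMLDOC gem: returns the data
# unchanged when it looks like a PDF and raises otherwise.
PDF_MAGIC = '%PDF-'.freeze

def ensure_pdf!(data)
  unless data.is_a?(String) && data.start_with?(PDF_MAGIC)
    raise ArgumentError, "expected PDF data, got #{data.class}"
  end
  data
end
```

In the controller it would wrap the call, as in send_data ensure_pdf!(render_to_pdf(...)), so a failed conversion raises a descriptive error instead of failing inside send_data.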

Now an example of a view:

<h3>Showing: <%= pluralize(@items.size, 'item') %></h3>
<table><tbody>
<tr>
<th>Field1</th>
<th>Field1</th>
<th>Field1</th>
</tr>
<% @items.each do |item| %>
<tr>
<td><%= item.field1 %></td>
<td><%= item.field1 %></td>
<td><%= item.field1 %></td>
</tr>
<% end %>
</tbody></table>

Maybe it’s just me, but that sure beats the examples I have seen using the PDF Writer plugin.

Finally, to generate a link to the PDF, assuming you are using restful routes:

<%= link_to 'PDF', formatted_items_path(:pdf) %>

HTMLDOC may not be perfect, but I found its ease of use for generating nicely formatted PDF files far outweighed its limitations. I hope you found this useful.

Update: 12-26-2007

I also made a little helper method for images. Put this in your application_helper.rb:

def pdf_image_tag(image, options = {})
  options[:src] = File.expand_path(RAILS_ROOT) + '/public/images/' + image 
  tag(:img, options)
end
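
The helper above covers images you add to a view yourself. For HTML that already contains img tags (stored article text, or output from image_tag, which appends a numeric cache-busting suffix), a similar idea can be applied to the whole string. The following is a rough sketch in plain Ruby; absolutize_image_paths is a hypothetical name, and it assumes root-relative, double-quoted src attributes:

```ruby
# Hypothetical helper, not part of Rails or the HTMLDOC gem.
# Rewrites root-relative <img> sources such as "/images/logo.gif?1199999999"
# to absolute filesystem paths under public_dir, dropping the numeric
# cache-busting suffix. Other sources (http://, //host/...) are left alone.
def absolutize_image_paths(html, public_dir)
  html.gsub(/(<img\b[^>]*?\bsrc=")(\/(?!\/)[^"?]*)(?:\?\d+)?"/) do
    %(#{$1}#{File.join(public_dir, $2)}")
  end
end

html = '<p><img alt="Logo" src="/images/logo.gif?1199999999" /></p>'
puts absolutize_image_paths(html, '/var/www/myapp/public')
```

In a Rails application of this vintage, public_dir would typically be File.join(RAILS_ROOT, 'public'). A regular expression is a blunt tool for HTML, so treat this as a starting point rather than a complete solution.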


  • Gary
    Just what I was looking for. Works like a charm!!! thanks!
  • ticomaurio
    Hello,

    I'm having difficulty installing htmldoc-1.9.x-r1629 on ubuntu 9.10

    when I write "make" I get the following:

    Making all in htmldoc...
    Compiling htmldoc.cxx...
    htmldoc.cxx: In function ‘void parse_options(const char*, int (**)(hdTree*, hdTree*, hdTree*))’:
    htmldoc.cxx:2201: warning: ‘font_family’ may be used uninitialized in this function
    htmldoc.cxx:2202: warning: ‘font_style’ may be used uninitialized in this function
    htmldoc.cxx:2203: warning: ‘font_weight’ may be used uninitialized in this function
    htmldoc.cxx: In function ‘int main(int, char**)’:
    htmldoc.cxx:522: warning: ‘font_family’ may be used uninitialized in this function
    htmldoc.cxx:523: warning: ‘font_style’ may be used uninitialized in this function
    htmldoc.cxx:524: warning: ‘font_weight’ may be used uninitialized in this function
    Compiling array.cxx...
    Compiling entity.cxx...
    Compiling file.cxx...
    file.cxx: In static member function ‘static char* hdFile::basename(const char*, char*, size_t)’:
    file.cxx:111: error: invalid conversion from ‘const char*’ to ‘char*’
    file.cxx: In static member function ‘static char* hdFile::localize(char*, size_t, const char*)’:
    file.cxx:585: warning: deprecated conversion from string constant to ‘char*’
    file.cxx:587: warning: ignoring return value of ‘char* getcwd(char*, size_t)’, declared with attribute warn_unused_result
    make[1]: *** [file.o] Error 1


    and when I write "make" I get the following:

    Making all in htmldoc...
    Compiling file.cxx...
    file.cxx: In static member function ‘static char* hdFile::basename(const char*, char*, size_t)’:
    file.cxx:111: error: invalid conversion from ‘const char*’ to ‘char*’
    file.cxx: In static member function ‘static char* hdFile::localize(char*, size_t, const char*)’:
    file.cxx:585: warning: deprecated conversion from string constant to ‘char*’
    file.cxx:587: warning: ignoring return value of ‘char* getcwd(char*, size_t)’, declared with attribute warn_unused_result
    make[2]: *** [file.o] Error 1
    Installing in fonts...
    Installing font files in /usr/local/share/htmldoc/fonts...
    Installing in data...
    Installing in doc...
    Rebuilding documentation...
    Formatting htmldoc.html...
    /bin/sh: ../htmldoc/htmldoc: not found
    make[1]: *** [htmldoc.html] Error 127

    Is there any plans to fix it???

    Thank you.
    Tico Mauriño
  • Hi,

    I'm having difficulty installing htmldoc as instructed in the article on linux.

    First I had to change this line 'curl -O http://ftp.easysw.com/pub/htmldoc/snapshots/htm...
    to the latest version which I believe was r1629.

    The rest of the steps went smoothly except when I try to install with 'sudo make install' I get the following:

    Making all in htmldoc...
    Making all in doc...
    Formatting htmldoc.html...
    INFO: Reading intro.html...
    INFO: Reading 1-install.html...
    INFO: Reading 2-starting.html...
    INFO: Reading 3-books.html...
    INFO: Reading 4-cmdline.html...
    INFO: Reading 5-cgi.html...
    INFO: Reading 6-htmlref.html...
    INFO: Reading 7-guiref.html...
    INFO: Reading 8-cmdref.html...
    INFO: Reading a-license.html...
    INFO: Reading b-book.html...
    INFO: Reading c-relnotes.html...
    INFO: Reading d-compile.html...
    make[2]: *** [htmldoc.html] Segmentation fault
    make[2]: *** Deleting file `htmldoc.html'
    Installing in fonts...
    Installing font files in /usr/local/share/htmldoc/fonts...
    Installing in data...
    Installing in doc...
    Formatting htmldoc.html...
    INFO: Reading intro.html...
    INFO: Reading 1-install.html...
    INFO: Reading 2-starting.html...
    INFO: Reading 3-books.html...
    INFO: Reading 4-cmdline.html...
    INFO: Reading 5-cgi.html...
    INFO: Reading 6-htmlref.html...
    INFO: Reading 7-guiref.html...
    INFO: Reading 8-cmdref.html...
    INFO: Reading a-license.html...
    INFO: Reading b-book.html...
    INFO: Reading c-relnotes.html...
    INFO: Reading d-compile.html...
    make[1]: *** [htmldoc.html] Segmentation fault
    make[1]: *** Deleting file `htmldoc.html'


    Do these segmentation faults mean it has failed? When I do a search for htmldoc on the file system it doesn't appear in the /usr/local/share directory.

    Any help would be greatly appreciated. Thanks in advance!
  • mgauthier
    Does anyone know where to find additional documentation for the htmldoc ruby gem. I have found this site http://htmldoc.rubyforge.org/, but I am looking for somewhere that will list all of the options. For example I used pdf.set_option :title, false but I just guessed that that was the correct option to remove the title page. Is there a good site that lists all of them?
  • itjobs1
    I've got some code using PDF::Writer, and managing the templates is a pain. If I could design the templates using normal HTML views, and then render them as PDF with some reasonable control over the style, I'd be a happy man.
    www.staffingpower.com
  • Wolfram
    Sorry, this one got messed up. So, the special comments are <!-- NEW PAGE -->, <!-- FOOTER LEFT "foo" -->.
  • Wolfram
    Nice article, thanks a lot!
    I'm trying to include special htmldoc comments, such as <!-- NEW PAGE -->, <!-- FOOTER LEFT "foo" --> and so on, but failed so far. Does anyone have a hint?
  • Torpedo
    Thank You!!!

    Torpedo Gratis
  • sildur
    A suggestion: if you are dealing with an UTF-encoded website, you should replace the line:
    data = render_to_string(options)
    with the following line:
    data = Iconv.conv('ISO-8859-15//IGNORE//TRANSLIT', 'utf-8', render_to_string(options))
    This is needed because htmldoc (at least v1.8) doesn't support UTF
  • Nuno Prata
    Hello m8. In first place thank you for the great gem/plugin.

    I'm currently having a problem. I'm trying to apply a body image to my pdf's using the method:
    pdf.set_option :bodyimage, "report_bg.jpg"

    It's not that the image doesn't display right, it just doesn't display at all...

    Any ideas? Tks in advance
  • Having some issues with HTMLDoc, Rails 2.2.2 and Phusion Passenger - Apache. With the GEM installed, passenger refuses to start up with an "Unknown error" and doesn't log anything so debugging is tough. Have you had any successful implementations on Rails 2.2.2 with Passenger? Suggestions on how to proceed?
  • mgauthier
    Hi Robert,

    I had this problem as well and it was because my environment.rb did not have the correct lines in it....

    "I placed Mime::Type.register 'application/pdf', :pdf at the end of my environment.rb file, as in outside of the "Rails::Initializer.run do |config|" block

    I placed require 'htmldoc' at the beginning of the environment.rb file, also outside of the "Rails::Initializer.run do |config|" block
    "

    Hope that helps.
  • Hi Robert. We haven't tried HTML doc with passenger. We have successfully used Prawn with that setup though. I'd give that a try.
  • Yves-Eric
    Nice article, I had things up and running in a few minutes...

    But show stopper for me: HTMLDOC does not support CJK (Chinese, Japanese, Korean) languages.
  • Fernando
    Hi, I want to know how can I set the page as landscape.
  • kams
    Hello,

    getting error as for rails application.
    "'htmldoc' is not recognized as an internal or external command,\noperable program or batch file.\n"

    Thank you
  • bharati
    getting file does not begin with '%pdf-' error.. Can anyone help me..
  • font
    got the soltn for this , am facing the same.. can anyone help ?
  • saving for later
  • def self.generate_pdf(url, links=false)
    doc = Document.new
    doc.mime = 'application/pdf'
    pdf = PDF::HTMLDoc.new
    pdf.set_option :bodycolor, :white
    pdf.set_option :links, links
    pdf.set_option :webpage, true
    pdf.set_option :path, "#{RAILS_ROOT}/public/"
    pdf << url
    pdf.footer ".1."
    if pdf.generate
    puts "Successfully generated a PDF file"
    doc.body = pdf.generate
    else
    puts "ERROR!--------------------------------------------"
    for error in pdf.errors
    puts error
    end
    end
    doc
    end

    The important line is pdf.set_option :path, "#{RAILS_ROOT}/public/"

    It tells htmldoc to look in your public folder for images.

    If you manually create the img tag this will work and the images will be the same on the web as in the pdf. Using image_tag however will not work, because rails adds on the uid to each image.

    If you were adventurous, you could parse out the uid from the tags with gsub. Now if I could just figure out how to keep the errors without having to generate twice.
  • Just a note for Ubuntu users: HTMLDOC must be installed by hand (configure, make, make install as stated here by author): if you install it with apt-get then HTMLDOC is not working in rails :)
    With Debian is fine to install htmldoc with apt-get install.

    The patch for images is necessary in both Debian and Ubuntu.

    Thank you *very much* to all the people who shared their knowledge here :)
  • dafinn:

    I've got an app that does the same. I wrote a very simple helper that expands the image paths.

    def write_full_image_path(text)
    newtext = text.gsub('img src="', 'img src="' + File.expand_path(RAILS_ROOT) + '/public/')
    newtext
    end

    Then, in my .rpdf files, I call for write_full_image_path(model.formatted_text)
  • dafinn
    First of all, Great work! Due to the Patch of Christoph printing Images with a complete path is working now!
    BUT... I am storing Articles in my Database. I use FCKEditor for the User-interface - the text is stored HTML formatted and the path for Images is stored in the HTML Code in form of "/public/uploads/Images" and so on. I have explored that HTMLDOC needs the full path(as some People above). For Example http://0.0.0.0:3000/public/uploads/Images ...
    Has anyone an idea how to fix this? I have seen the pdf_image_tag but i am not sure how it works...
  • Hey, that has me very helped. Thanks!
  • srishti
    How can we use html doc in rails 1.2.3
  • sorry to ask what may be a simple answer, but how do you apply that patch?

    thx...
  • Christoph, YOU ROCK.

    Thanks for the patch!

    I was getting the exact same error when using the pdf_image_tag helper. Applied the patch and viola! Image renders perfect.
  • I tracked it down to some weird htmldoc output, which contains whitespaces sometimes.

    Here is a patch that solves my problem:
    http://textmode.at/2008/5/14/ruby-htmldoc-gem-f...

    Not sure if it's the same problem as the user above has.
    I contacted the author htmldoc people.

    greetings
  • It sounds like you are missing something. I would have to see your code to help more I think. Sorry.
  • same error here, false class when using an image... any solution?
  • Kelly
    As an addition when I debug it I get here in the streaming and I see data is null..

    ********************
    def send_data(data, options = {}) #:doc:
    logger.info "Sending data #{options[:filename]}" unless logger.nil?
    send_file_headers! options.merge(:length => data.size)
    @performed_render = false
    ********************
    is there a way to pass the record I have when I call this?
    right now we do send_data render_to_pdf but is there a way to say @object so I can pass the object created by the finder? send_data @object.render_to_pdf does not work either as I get a missing method render_to_string that way
  • Kelly
    Hi Chris. Say quick question.. I have everything wired up but I'm actually calling the render_to_pdf from another controller in my app. It work fine but give me this error.
    undefined method `size' for false:FalseClass
    I walk thru it with the debugger and it seems that data is nil so it can't do a size. I assume that is because the model for my control is not an option to post to the render_as_pdf but not sure.
  • Matt
    Hi Chris,

    Thanks for that. The trouble is (and I should have mentioned) that it's on Windoze. If I set it to 'c:\rails\project\public\images\some_image.jpg', load the page directly and save it to disk and then load the page in explorer it looks fine (that is, it loads the image/s from disk). But the Ruby htmldoc plugin still doesn't get them.

    I've also tried prepending file:/// and a few other things.

    //matt
  • Matt,

    I found that I had to use absolute path to the image. Thus, for example:

    /Users/chris/src/my_rails_app/public/images/some_image.png


    I hope that helps.
  • Matt
    Hi all, am I the only one who can't ever get images to be included in the document using PDF::HTMLDoc?

    If I run HTMLDOC from the command line with supposedly the same arguments (basically just --webpage) then it works a treat, even if I point it at the URL for my dynamically generated rails page. However if I call it using the Rails GEM then it renders everything except the images.

    It's driving me bonkers and pretty urgent. I've tried all sorts of things including the pdf_image_tag helper above with numerous modifications (eg. adding file:/// at the start).

    Cheers,
    Matt
  • @Nidhika: ensure that you are using the ' mark and not the full quotation mark if you copy/pasted from the post.
  • Nidhika
    Whem I am puting
    " Mime::Type.register 'application/pdf', :pdf
    require 'htmldoc'"

    In config/enviroment.rb file in the last webrick server not restart. How am I solve this issue.
    Is anybody help me

    Thanks
    Nidhika
  • mgauthier
    I placed Mime::Type.register 'application/pdf', :pdf at the end of my environment.rb file, as in outside of the "Rails::Initializer.run do |config|" block

    I placed require 'htmldoc' at the beginning of the environment.rb file, also outside of the "Rails::Initializer.run do |config|" block

    Hope that helps...

    mg
  • @Dave:

    send_data has an option for filename.

    send_data render_to_pdf({ :action => 'index.rpdf', :layout => 'pdf_report' }), :filename => "foobar.pdf"


    Chris
  • Dave
    Chris, thanks for the clarification on the routing for me. Much obliged.
  • Dave
    also, how do you name the pdf
  • @Dave:

    It comes from restful routes. Example:

    map.resources :posts

    gives you:

    posts_path as well as formatted_posts_path(:format)

    Try rake routes within in your project directory to see all your routes.

    Chris
  • Dave
    WTF is formatted_items_path? I can't find documentation on this anywhere.
  • Chris W
    If you have to have css support and don't mind spending some money, check this solution out.

    HTML / CSS to PDF using Ruby on Rails
    http://sublog.subimage.com/articles/2007/05/29/...
  • Dan,

    I will try and post an example PDF tomorrow morning. However, I can tell you it looks pretty close to what a normal HTML page would look like.

    @Robert Shell: I tested images and you give it the full path. For example /Users/chris/Projects/railsapp/public/images/logo.gif


    Chris
  • Dan Kubb
    Would you mind posting an example of what the example view renders to when rendered as PDF?

    I've got some code using PDF::Writer, and managing the templates is a pain. If I could design the templates using normal HTML views, and then render them as PDF with some reasonable control over the style, I'd be a happy man.
  • Jamie,

    I am not entirely sure. I believe the website states the nightly builds have at least some support for CSS.


    Chris
  • Looks a nice alternative to PDF Writer. Does is take into account CSS?
  • Robert,

    I had the same problem with images, but have not looked into it yet as it was not of primary concern.
    Chris
  • Lovely approach, used the same in my projects. Only really problem Have had is with embedded, generated images which are sometime lost as if htmldoc expects file paths and not urls. Have on todo to look at fixing this with some pre caching.

    Tried a number of other approaches before this and definitely the best and really easy to implement

    Robert.