Sunday, February 03, 2013

Hatter - An App to turn text files into ePub books

I am not happy with the ePub files I download for titles at Project Gutenberg. They seem like bundled up versions of the text files that have been split arbitrarily. I want something broken on chapter boundaries the way the author wrote the book. So I wrote an application to generate decent ePubs from a text file. I got started so I'd have a tool to convert files from Project Gutenberg, but it will work on any text file.


Here you can see that I have opened "Walden" by Henry David Thoreau. I haven't done much processing on it beyond adding some <h1> tags for chapter titles and some <table>s for tabular data. (Actually, I did have to do some more work on the file. Project Gutenberg text files include randomly uppercased words and double dashes --. To make a clean file you have to convert these markers. The text file is still readable with the markers, but looks really ugly in an ePub.)
The blue markers on the line numbers to the left designate where one section ends and a new section starts. Click a line number to add a marker. Clicking a marker lets you specify the text to use as a Table of Contents entry:


You don't have to mark paragraphs. Any block of text is treated as a paragraph unless it starts and ends with an <h1>, <h2>, <blockquote> or <table> HTML tag. Blocks wrapped in those tags are inserted as is. Other blocks are wrapped in a <p> tag. You can insert any valid HTML tags within blocks and they will be left as is. You can make something bold by surrounding it with <b> tags or italic with <i> tags.
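To make the block rule concrete, here is a minimal sketch in Python. It is not Hatter's actual code. It assumes blocks are separated by blank lines and that ordinary blocks get wrapped in a <p> tag; the function names is_pass_through and blocks_to_html are made up for this illustration.

```python
import re

# Tags whose blocks are inserted as is, per the rule described above.
PASS_THROUGH_TAGS = ("h1", "h2", "blockquote", "table")


def is_pass_through(block):
    """Return True if the block starts and ends with the same pass-through tag."""
    for tag in PASS_THROUGH_TAGS:
        opens = re.match(rf"<{tag}(\s[^>]*)?>", block, re.IGNORECASE)
        closes = re.search(rf"</{tag}\s*>$", block, re.IGNORECASE)
        if opens and closes:
            return True
    return False


def blocks_to_html(text):
    """Split text into blocks and wrap every ordinary block in a paragraph tag."""
    # Assumption: one or more blank lines separate blocks.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    # Inline tags such as <i> or <b> inside a block are left untouched.
    return [b if is_pass_through(b) else f"<p>{b}</p>" for b in blocks]
```

Given a heading block, a plain block and a blockquote block, only the plain block comes back wrapped; the other two are returned unchanged.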

You need to set a Title and Author:


A publisher and identifier can be set, but they are optional. The identifier is usually the book's ISBN, but you can use whatever you'd like. If you don't set an identifier, Hatter will generate a UUID and use that as the book's identifier. You can also add a cover image by dragging an image to the Cover Image well.
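The identifier fallback can be sketched roughly like this in Python. This is only an illustration, not Hatter's code: the function name book_identifier is hypothetical, and the choice of a random (version 4) UUID in URN form is an assumption.

```python
import uuid


def book_identifier(user_identifier=None):
    """Return the identifier the user set, or a freshly generated UUID."""
    if user_identifier and user_identifier.strip():
        return user_identifier.strip()
    # Assumption: the fallback is a random UUID written as "urn:uuid:...".
    return uuid.uuid4().urn
```

Calling book_identifier() with no argument gives a new identifier each time, so two books built without an ISBN won't collide.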

Hatter generates ePubs that validate using epubcheck version 3.0. The ePub files also include an NCX entry so older readers should be able to read the files too.

I've started another post with sample ePubs here: Hatter SampleBooks. As I finish new sample books I'll add them to that post.

Some things I am working on:

1) Being able to edit the book's CSS. For now you get some generic default values that look OK. - done
2) Add the ability to add other resources that are included in the book. This would allow you to reference fonts and images in the text. - done
3) Make the UI not so ugly. - ongoing.

I hope to make Hatter available soon on the Apple App Store. It will probably be $10 or so.

Oh, and in case you think you are going to create eBooks using files from Project Gutenberg and upload them to the iBooks store, read the license on the Project Gutenberg files. You can create free books, but not for sale ones. (Well, actually you can. But you have to donate 20% back to Project Gutenberg, which isn't a bad thing.)

Update Feb 16, 2013: I think everything is done for the 1.0 release. I just need to do some more testing and submit to the Apple App Store.

Update Feb 23, 2013: Version 1.0.0 is available on the App Store. Version 1.0.1 is coming soon with the ability to load and save Hatter documents to iCloud and a menu item to show the Getting Started Guide in case users need to refer back to it.

Update Mar 1, 2013: Uploaded version 1.0.1 to the App Store. Fixed an issue creating ePubs that have only one section. (That was embarrassing.) Version 1.0.1 also has iCloud support and a menu item under the Help menu that shows the Getting Started Guide.

Update Mar 17, 2013: I rejected the v1.0.1 binary since Apple was taking a LONG time to review it. I guess they are getting behind. I uploaded a new binary that has more features and more bug fixes. I've been working with a small Taiwanese publisher to fix some issues related to generating a Chinese ePub.

The next thing I'm working on is moving ePub generation off the main thread. Most ePubs generate in a second or two. Les Misérables takes several minutes and shows why I need a status window.
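The general shape of that change, a worker thread that does the build while reporting progress back for a status window, looks something like the Python sketch below. It is only an illustration of the pattern, not how Hatter does it; build_epub, build_in_background and drain_progress are hypothetical names, and the per-section "work" is a placeholder.

```python
import queue
import threading


def build_epub(sections, progress):
    """Hypothetical stand-in for the slow build; reports progress per section."""
    total = len(sections)
    for index, section in enumerate(sections, start=1):
        _ = f"<section>{section}</section>"  # placeholder for the real work
        progress.put((index, total))
    progress.put(None)  # sentinel: the build is finished


def build_in_background(sections):
    """Start the build on a worker thread so the caller's thread stays free."""
    progress = queue.Queue()
    worker = threading.Thread(target=build_epub, args=(sections, progress), daemon=True)
    worker.start()
    return worker, progress


def drain_progress(progress):
    """Collect updates until the sentinel; a UI would refresh a status window here."""
    updates = []
    while True:
        item = progress.get()
        if item is None:
            return updates
        updates.append(item)
```

The queue is the only thing shared between the two threads, which keeps the UI side from ever touching half-built book data.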

Update Mar 29, 2013: After two rejections for being able to save documents to iCloud, I gave up and turned off iCloud support. (I'm not the only one having their app rejected for this.) 1.0.1 is submitted again. It has MANY bug fixes, and building ePubs now happens on a separate thread and is very fast: a second or two for Les Misérables.

Update Oct 6, 2013: Hatter is at version 1.0.4. It has a lot more stability and is significantly better at creating ePubs than version 1.0. Most of the credit for making that happen goes to Fred Jame, who runs a publishing company in Taiwan (http://puomo.tw/). He pushed me to support a lot of features I wouldn't have thought of, like vertical text and epub:type attributes for asides in the text body.

Since 1.0 Hatter has gotten a live preview mode so you can see how the section will look in the final ePub. Live preview is really useful for seeing how CSS changes will look without having to build an ePub and load it onto a device.

The next big feature is the ability to add and edit items in the Tag Palette. The Tag Palette is a table of buttons you can use to enter blocks of text in a way that makes sense for a type of tag: h tags go around a line, blockquote tags go around a block of text, i tags go around the current selection. You can export and import a tag palette so you can share it with a group. Editing tag palettes will come in version 1.0.5, which should be out soon.