Showing posts with label LaTeX. Show all posts

Wednesday, March 25, 2015

ASU Dissertation Template Now on ShareLaTeX

I have already posted about the LaTeX template that I developed for Arizona State University (ASU) theses and dissertations. I was and am highly motivated to help ASU grad students get through format review quickly and painlessly. In my view, the template is a great step in that direction.

In the README for the GitHub repo for the dissertation template, I invite users to contact me if they need help using the template. Most of the messages that I have gotten are actually requests for help setting up TeX rather than resolving problems with the template. It seems that people who struggle to use the template are stumbling on the first step: getting a TeX distribution up and running on their computers.

Troubleshooting these problems is incredibly difficult from a distance. There are so many things that could be going wrong on the user end. The problem could be with the TeX distribution or my dissertation template; it could be the particular TeX editor that the person is using; or it could be something else entirely. I have to admit I’m not sure how to troubleshoot these problems unless I’m sitting at the user’s computer and able to try it out for myself, but in-person support is not a practical solution.

So I have turned my dissertation template into a ShareLaTeX template. ShareLaTeX is an online LaTeX editor. Anyone can write or upload LaTeX code, and ShareLaTeX will convert it into a PDF, with no need to install TeX on your own computer. I’ve tried it out over the past couple of days, and it seems like quite a fast, stable site with some cool features. Creating an account is painless; it just takes an email address and a password. I don’t think you even need to confirm your email address.

Once you have an account, you can start a new project based on the ASU dissertation template. Go to the page for the ASU dissertation template, and click “Open in ShareLaTeX”. ShareLaTeX will automatically create a new project for you based on the template.

I have tested the template on ShareLaTeX, and it runs great. So ASU grad students rejoice! You now have an even more user-friendly way to format your dissertations and theses.

ShareLaTeX also provides some tutorial information for people new to LaTeX:

The second point, the documentation page, contains a wealth of information about LaTeX basics, such as making text bold and inserting tables, and more advanced topics, such as integrating R code into your LaTeX documents. (By the way, I’m sort of blown away that ShareLaTeX supports R. This is an incredibly handy feature.)

To recap, the basic steps for using the ASU dissertation template on ShareLaTeX follow:

  1. Create a ShareLaTeX account.

  2. Start a new project based on the ASU dissertation template:
    1. Visit the template page.
    2. Click the “Open in ShareLaTeX” button.
  3. Update the template with your own dissertation information (e.g., author name) and content. (The README file and the template file itself walk you through where and how to update the template.)

  4. Click the “Compile” or “Recompile” button to make your document.

Using ShareLaTeX should make dissertation and thesis formatting easier than it was before. For people who want an entirely painless formatting experience, I offer a typesetting service through my consulting company. I’ll take your LaTeX files and update them so that your thesis or dissertation passes format review. I have found that many people end up coming to me after going through a couple rounds of revisions with the format reviewers at the grad office as graduation deadlines are fast approaching. Please don’t wait to get in touch if you’re stuck on formatting. I’ll help however I can, whether answering questions by email (for free) or doing the typesetting myself (for a fee).

Monday, February 23, 2015

Fixing Sente's Export to BibTeX

I recommended Sente as a reference manager and tool for reviewing literature in a previous post. And Sente is great for both of those tasks (assuming you have a Mac). However, Sente does a poor job when exporting a library to BibTeX format—something you would want to do if you were writing an article using LaTeX. For example, Sente leaves the title field blank for web pages. Fortunately, I have found that I can solve these kinds of problems with a bit of coding.

Outline of the Solution

Sente accurately exports data to its own SenteXML format, so my solution uses the following steps:

  1. From Sente, export the reference library to SenteXML format.
  2. From Sente, export the reference library to BibTeX format.
  3. Using a script,
    1. Read in the SenteXML file.
    2. Read in the BibTeX file.
    3. Loop through each entry in the BibTeX file and
      1. Check if the entry is a web page;
      2. If it is a web page, retrieve the title from the same entry in the SenteXML file and save the new title for the BibTeX entry;
      3. Save the BibTeX entry.

First, note that you must start with a library of references in Sente and then export your library to the two formats listed in steps 1 and 2. Second, note that this procedure could be used to check and modify any field, but in this blog post, I address missing titles for web pages.
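The loop in step 3 can be sketched in a few lines. The following Python is only an illustration of the idea, not the perl script described in this post. It assumes that the titles have already been pulled out of the SenteXML file into a dictionary keyed by citation key, that each BibTeX entry ends with a closing brace on its own line, and that web pages use one of the entry types in `web_types`. The function name, the dictionary, and the entry types are all hypothetical.

```python
import re

# Simplified entry matcher: "@type{key," up to a closing brace on its own line.
# Real BibTeX can be messier; a proper parser is safer for production use.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
EMPTY_TITLE_RE = re.compile(r"(^\s*title\s*=\s*)\{\s*\}", re.IGNORECASE | re.MULTILINE)
ANY_TITLE_RE = re.compile(r"^\s*title\s*=", re.IGNORECASE | re.MULTILINE)


def fill_missing_titles(bibtex_text, titles_by_key,
                        web_types=("misc", "webpage", "online", "electronic")):
    """Fill empty or missing title fields of web page entries.

    titles_by_key maps citation keys to titles (assumed to come from SenteXML).
    web_types is a guess at the entry types used for web pages.
    """
    def repair(match):
        entry_type, key, body = match.group(1), match.group(2), match.group(3)
        if entry_type.lower() not in web_types or key not in titles_by_key:
            return match.group(0)  # not a web page, or no replacement title
        title = titles_by_key[key]
        new_body, count = EMPTY_TITLE_RE.subn(
            lambda m: m.group(1) + "{" + title + "}", body)
        if count == 0 and not ANY_TITLE_RE.search(body):
            # No title field at all: add one as the first field.
            new_body = "\n  title = {" + title + "}," + body
        return "@" + entry_type + "{" + key + "," + new_body + "\n}"

    return ENTRY_RE.sub(repair, bibtex_text)


sample_bib = """@misc{smith04,
  title = {},
  url = {http://example.com}
}

@article{doe99,
  title = {A Real Article},
  year = {1999}
}
"""
titles = {"smith04": "Example Web Page"}
repaired = fill_missing_titles(sample_bib, titles)
```

Entries that are not web pages, or that already have a title, pass through unchanged.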

Solution Details

Getting Perl

I implemented the solution using a perl script. If you don’t have perl, I recommend installing it using perlbrew. I strongly recommend using perlbrew if you’re using a Mac or a Linux machine because you avoid modifying the version of perl that’s installed by default on your operating system. Some system utilities and other applications might rely on the default perl installation, so if you modify it, these utilities and applications might break. See perlbrew’s web page for installation instructions, but you can likely install it with the following command:

\curl -L http://install.perlbrew.pl | bash

Then, to install a stable version of perl (5.16.0 is used throughout this post), enter the following command:

perlbrew install perl-5.16.0

Setting Up Perl

The script uses a few external perl modules, which need to be installed. Assuming that you’re using perlbrew with version 5.16.0 of perl, enter each of the following lines on the terminal (waiting for each to finish before entering the next):

perlbrew exec --with perl-5.16.0 cpanm XML::Simple
perlbrew exec --with perl-5.16.0 cpanm Text::BibTeX
perlbrew exec --with perl-5.16.0 cpanm Getopt::Long

Using the Script

You can retrieve the script and some example files here:

The GitHub repository is here, and you can download the entire repository as a ZIP file here.

To use the script, put all three files in the same folder. If you’re using your own files (instead of the example files above), make sure to move your files into the same folder as the script. You need to either rename your own files to the default file names used in the script or use command-line arguments to specify the file names that you are using (see below). The default file names are the ones used in the example files:

  • SenteXML file: references.xml
  • Original BibTeX file: references.bib
  • Updated BibTeX file: references_new.bib

Next, open a terminal and run the script with the following command (again, assuming that you have used perlbrew to install version 5.16.0 of perl):

perl5.16.0 repair.pl

The script will generate a file called references_new.bib which contains all the same references as the original file and correct titles for web page entries.

The script also accepts the following command line arguments:


--sente-file  FILENAME     Name of the SenteXML file
--bib-infile  FILENAME     Name of the original BibTeX file
--bib-outfile FILENAME     Name of the updated BibTeX file

For example, if you want to save the updated file as bibliography.bib instead of references_new.bib, enter the following command in the terminal:

perl5.16.0 repair.pl --bib-outfile bibliography.bib

You could also directly overwrite the old BibTeX with the updated one:

perl5.16.0 repair.pl --bib-outfile references.bib

But be careful because the old BibTeX file will be gone forever.
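For readers who want to see how options like these fit together, here is a rough Python equivalent of the command-line interface. It is a sketch only and is not taken from repair.pl. The option names and default file names come from this post, while `build_parser` and `would_overwrite` are hypothetical helper names. The second helper flags the risky case where the output file is the same as the input file.

```python
import argparse


def build_parser():
    """Build a parser with the three options and defaults listed above."""
    parser = argparse.ArgumentParser(
        description="Repair titles in a BibTeX file exported from Sente.")
    parser.add_argument("--sente-file", metavar="FILENAME",
                        default="references.xml",
                        help="Name of the SenteXML file")
    parser.add_argument("--bib-infile", metavar="FILENAME",
                        default="references.bib",
                        help="Name of the original BibTeX file")
    parser.add_argument("--bib-outfile", metavar="FILENAME",
                        default="references_new.bib",
                        help="Name of the updated BibTeX file")
    return parser


def would_overwrite(options):
    """Return True when the output would replace the original BibTeX file."""
    return options.bib_infile == options.bib_outfile


# Same arguments as the first example command above.
options = build_parser().parse_args(["--bib-outfile", "bibliography.bib"])
# Same arguments as the overwriting example.
risky = build_parser().parse_args(["--bib-outfile", "references.bib"])
```

A script built this way could make a backup copy first whenever `would_overwrite` returns True.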

Next Steps

The script is a straightforward procedural script that loops through entries in a BibTeX file and makes changes according to some simple instructions. It’s not sophisticated, and for a simple task such as this, it doesn’t need to be. But there are many other ways in which an author might want to fix up a BibTeX file.

When I wrote my dissertation, I used a more elaborate version of this script to do all of the following:

  • convert titles to headline-style capitalization according to the Chicago Manual of Style,
  • correctly indicate the translator(s) of sources,
  • correctly alphabetize institutional authors with hyphens in their names (e.g., “UN-HABITAT”),
  • clean up the edition field, which inconsistently contained ordinal numbers, “ed.”, and “edition” in my reference library,
  • clean up US state names (e.g., by removing internal periods), and
  • insert missing titles for laws and statutes.

With these tasks, it might make sense to create some more general methods—for example, a general method that retrieves the contents of a field (such as the title field), sends it to a regex, and then updates the field with the result.
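As a rough illustration of such a general method, the Python sketch below applies a regex substitution to one field of an entry. It assumes that each BibTeX entry has already been parsed into a plain dictionary of field names and values. That representation and the name `update_field` are assumptions made for this example, not part of my perl script.

```python
import re


def update_field(entry, field, pattern, replacement, flags=0):
    """Return a copy of entry with a regex substitution applied to one field.

    Entries without the field are returned as an unchanged copy.
    """
    updated = dict(entry)
    if field in updated:
        updated[field] = re.sub(pattern, replacement, updated[field],
                                flags=flags).strip()
    return updated


# Example: reduce "2nd ed." or "2nd edition" to the bare ordinal.
entry = {"title": "Some Book", "edition": "2nd ed."}
cleaned = update_field(entry, "edition", r"\s*(ed\.|edition)\s*$", "",
                       flags=re.IGNORECASE)
```

Each cleanup task in the list above then becomes a pattern and a replacement rather than a separate block of code.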

Advantages of Scripting

In general, I found that using perl to fix up the bibliographic database was very efficient. I did not want to manually update the BibTeX database because it was very large (over 2,000 entries) and because I was using Sente as the actual reference manager. If I had updated the BibTeX file manually, I would have had to re-export to BibTeX any entries from the Sente library that I modified, overwrite the old versions in the BibTeX file, and then manually update those entries in the BibTeX file as needed (e.g., entering titles for websites). That would have been far too much manual work because I was constantly adding new entries to my Sente reference library and occasionally updating older entries.

This approach is also more flexible. If I were to manually update my entire database so that (for example) all titles conform to Chicago-style headline capitalization, I would have to edit everything again to use those same entries in a document that needs to conform to a different style guide. By implementing these changes with a script, I left myself a relatively easy way to change the formatting for different documents.

Monday, November 3, 2014

Using Macros with "titlecaps" in a LaTeX Document

The LaTeX template that I wrote for Arizona State University’s dissertations and theses makes use of the titlecaps package. The titlecaps package has commands for selectively capitalizing words, so that titles can be correctly capitalized in what is often called headline style. Titles of chapters and sections, for example, should usually appear in headline style.

Style guides vary a little bit on the particular rules for headline style, but in general, the rules are the following:

  • Always capitalize the first and last word.
  • Capitalize all remaining words unless they are:
    • articles (i.e., a, an, the),
    • prepositions (e.g., of and for), or
    • conjunctions (e.g., and, but, and or).

Any particular style guide should give more details about exactly which words should not be capitalized.
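To make the rules concrete, here is a small Python sketch of them. It is not how titlecaps works internally. The word list is a short, illustrative subset, and the function name `headline_case` is made up for this example.

```python
# A deliberately short list of words to leave lowercase; a real style guide
# would supply a longer one.
LOWERCASE_WORDS = {"a", "an", "the", "and", "but", "for", "or", "nor",
                   "of", "in", "on", "to", "with"}


def headline_case(title, lowercase_words=LOWERCASE_WORDS):
    """Capitalize a title in headline style.

    The first and last words are always capitalized; other words are
    capitalized unless they appear in lowercase_words.
    """
    words = title.split()
    last = len(words) - 1
    result = []
    for index, word in enumerate(words):
        if index in (0, last) or word.lower() not in lowercase_words:
            result.append(word[:1].upper() + word[1:])
        else:
            result.append(word.lower())
    return " ".join(result)


example = headline_case("here is a table caption ending in of")
# example is "Here Is a Table Caption Ending in Of"
```

Note that the final “of” is capitalized because it is the last word, even though it is in the lowercase list.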

In any case, the titlecaps package can be used to perform this capitalization on a string of text. Here is an example of a memoir class document that shows how to use the titlecaps package:

\documentclass{memoir}
\usepackage{titlecaps}
\Resetlcwords
\Addlcwords{a an the} 
\Addlcwords{and but for or nor} 
\Addlcwords{aboard about above across %
  after against along amid among anti %
  around as at before behind below %
  beneath beside besides between %
  beyond but by concerning considering %
  despite down during except excepting %
  excluding following for from in %
  inside into like minus near of off %
  on onto opposite outside over past %
  per plus regarding round save since %
  than through to toward towards under %
  underneath unlike until up upon %
  versus via with within without}

\renewcommand{\cfttableaftersnumb}% 
  {\titlecap}%                      
  
\begin{document}

\listoftables
\clearpage

\begin{table}[h] % Table float
\caption{Here is a table caption ending in ``of''}
\label{table1}
\begin{tabular}{l c c} \hline
Column1 & Column2 & Column3 \\ \hline
Row1 & 2.0 & 3.0 \\
Row2 & 2.0 & 3.0 \\
Row3 & 7.0 & 8.0 \\ \hline
\end{tabular}
\end{table}

\end{document}

This setup will send the \titlecap command to all the table captions in the list of tables, which should appear after the table of contents. The table captions in the running text will not be changed.

As far as I can discern, the titlecaps package is relatively new; it appears that it was written in 2013 or maybe 2012, so it still has some rough edges. (Then again, LaTeX is pretty old by now, and it’s full of rough edges.)

One problem with titlecaps is that it does not give the last word of a title any special treatment. In the example above, the last word of the table caption, “of”, will not be capitalized in the list of tables because “of” is in the list of words to leave lowercase (defined by the \Addlcwords command). This behavior contradicts some style guides (such as the Chicago Manual of Style), and there does not seem to be a way to change this behavior.

Another problem is that macros in text sent to \titlecap are likely to break. In the above example, all captions are sent to \titlecap. Users might want to refer to another figure or table in a caption (e.g., with the \ref command) or put a citation in a caption (e.g., with \cite). But these macros will break if they’re sent to \titlecap. (By the way, I am not scolding the package author here. Capitalization in LaTeX is tricky, and when text gets sent to a macro for capitalization, odd things can happen.)

So I started searching for solutions. \textnc is a macro from the titlecaps package that causes \titlecap to ignore text within, so it seemed that wrapping a macro with \textnc would be a way to get \titlecap to skip over the macro. Unfortunately, that strategy did not work.

To (sort of) fix this problem, I read through the titlecaps documentation to see how its \textnc macro was defined. The other clue from the titlecaps documentation was that wrapping text with {{}} can also cause \titlecap to leave the text alone. Between this tidbit, the definition of \textnc, and a bunch of trial and error, I arrived at the following macro:

\newcommand{\macrocapwrap}[1]{% 
  {\bgroup\bgroup{{#1}}\egroup\egroup}%
}

Below is an example of this macro in action:

\documentclass{memoir}
\usepackage{titlecaps}
\Resetlcwords
\Addlcwords{a an the} 
\Addlcwords{and but for or nor}
\Addlcwords{aboard about above across %
  after against along amid among anti %
  around as at before behind below %
  beneath beside besides between %
  beyond but by concerning considering %
  despite down during except excepting %
  excluding following for from in %
  inside into like minus near of off %
  on onto opposite outside over past %
  per plus regarding round save since %
  than through to toward towards under %
  underneath unlike until up upon %
  versus via with within without}

\renewcommand{\cfttableaftersnumb}%
  {\titlecap}%

\newcommand{\macrocapwrap}[1]{% 
  {\bgroup\bgroup{{#1}}\egroup\egroup}%
}

\begin{document}

\listoftables
\clearpage

\begin{table}[h] % Table float
\caption{Here is a table caption ending in %
  ``of'' with number \macrocapwrap{\ref{table1}}}
\label{table1}
\begin{tabular}{l c c} \hline
Column1 & Column2 & Column3 \\ \hline
Row1 & 2.0 & 3.0 \\
Row2 & 2.0 & 3.0 \\
Row3 & 7.0 & 8.0 \\ \hline
\end{tabular}
\end{table}

\end{document}

Why does this macro work? I honestly have no idea. As I said, I arrived at it mostly through trial and error. It’s not ideal, and it seems a bit fragile, but I hope it can help other people until something more robust comes along.

Monday, October 20, 2014

Academic Workflow for the Ages (Part 2)

Over two years ago, I wrote a post about an academic workflow, mostly for literature review. I did not focus on writing, although I did recommend that people use Scrivener to draft their documents (dissertation chapters, journal articles, and so forth). In this post, I discuss an alternate way to draft documents, which I think is much better.

Pandoc

Natural Writing

I recommend that people write their documents in pandoc markdown. Pandoc’s author describes it as “your swiss-army knife” for documents because it can convert between many document formats (e.g., HTML to MS Word).

Markdown is a syntax for plain-text documents that aims to be readable in plain text but also have enough structure that it can be parsed and translated into other formats. (Markdown’s original authors aimed to generate HTML documents from markdown.) For example, the following is a list in a markdown document:

Here is a list written in markdown: 

* Here is an item in the list.
* Here is another item in the list. 
* Here is the final item in the list. 

Even in a plain-text file, it’s clear that this is a list, so it’s easy to read this document in plain text and understand the intended formatting. Additionally, because the syntax is very simple, it’s easy to just write your thoughts, arguments, and whatever else you need to without pausing to deal with formatting. If I wanted to make a list in an e-mail, I would write it exactly as written above. At least for lists, there’s nothing new to learn, and in general, writing in markdown is very natural. Markdown achieves a tremendous separation between content and formatting. Pandoc converts the list above into the HTML shown below.

<p>Here is a list written in markdown:</p>
<ul>
<li>Here is an item in the list.</li>
<li>Here is another item in the list.</li>
<li>Here is the final item in the list.</li>
</ul> 

When writing in pandoc markdown, you can also use HTML comments to make notes to yourself and keep them right next to the material they refer to. For example, you could write an outline of a journal article you’re drafting and enclose it in an HTML comment, so it’s excluded from the output (for most formats) but still at the top of your own markdown document.

<!-- This is an HTML comment. --> 

<!-- 
Comments can also 
span 
several lines. 
--> 

Here is a list written in markdown: 

* Here is an item in the list.
* Here is another item in the list. 
* Here is the final item in the list. 

And because pandoc markdown is stored in plain-text files, you can use a variety of text editors to write it. Personally, I prefer a minimalist writing environment because there are fewer distractions. MS Word, for example, has so many formatting tools immediately available that it’s tempting to write something and then immediately fix up how it looks. With a plain-text editor, there are virtually no distractions. My favorite at the moment is TextWrangler on the Mac with the font set to 24pt Helvetica. It’s free and simple, and it does what I need it to.

More Features

Pandoc expands traditional markdown with new features, such as different kinds of lists, different kinds of tables, in-text citations and reference lists, footnotes, and metadata. For example, you can make a table with the following pandoc markdown:

-------------------------------------------------------------
 Centered   Default           Right Left
  Header    Aligned         Aligned Aligned
----------- ------- --------------- -------------------------
   First    row                12.0 Example of a row that
                                    spans multiple lines.

  Second    row                 5.0 Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------

Table: Here's the caption. It, too, may span
multiple lines.

If you want to cite a source, you can write the following:

Blah blah [@smith04; @doe99].

smith04 and doe99 are BibTeX keys in a BibTeX database (see note 1). Pandoc will replace them with proper in-text citations and generate a reference list at the end of the document. You can even use Zotero’s citation styles, so there are a lot of options for automatic formatting, including Chicago, APA, MLA, and formats for many academic journals (see note 2).

For all the features and how to use them, spend some time on pandoc’s readme page, and try them out for yourself. Note: The table and citation examples in this section were copied verbatim from pandoc’s readme.

More Formats

Pandoc can convert markdown into formats other than HTML, including LaTeX. Pandoc converts the markdown list above into the following LaTeX:

Here is a list written in markdown:

\begin{itemize}
\itemsep1pt\parskip0pt\parsep0pt
\item
  Here is an item in the list.
\item
  Here is another item in the list.
\item
  Here is the final item in the list.
\end{itemize}

And as I mentioned above, pandoc can also convert markdown to an MS Word document (*.docx file). Do you see where this is going? If you write your documents in pandoc markdown, you can easily convert them to other formats as needed. Lists, tables, and citations will all appear correctly in a variety of formats. If you’re a PhD student and your advisors want to review your work in MS Word (usually because of its handy track changes feature and comments feature), you can just take the current version of your markdown file, convert it to MS Word, and send it out. And when it’s time to make a final version, you can still convert your markdown to LaTeX and apply whatever template you want to (your school’s template, a journal’s article template, or your own custom template). You don’t have to change your source file at all; just apply a new template to get the formatting you need.

A Simple Example

Returning to the list example above, if I have a markdown file named temp.md with the following contents:

Here is a list written in markdown: 

* Here is an item in the list.
* Here is another item in the list. 
* Here is the final item in the list. 

I can open a terminal and convert it to HTML with the following command:

pandoc -r markdown -w html -o temp.html temp.md

Pandoc will create an HTML file called temp.html with the following contents:

<p>Here is a list written in markdown:</p>
<ul>
<li>Here is an item in the list.</li>
<li>Here is another item in the list.</li>
<li>Here is the final item in the list.</li>
</ul>
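If you would rather drive pandoc from a script than type the command each time, the call can be wrapped in a few lines of Python. This is a minimal sketch that assumes pandoc is installed and on your PATH. The function names `pandoc_command` and `convert` are made up for this example, and the flags are the same ones used in the command above.

```python
import shutil
import subprocess


def pandoc_command(source, target, reader="markdown", writer="html"):
    """Build the argument list for the pandoc call shown above."""
    return ["pandoc", "-r", reader, "-w", writer, "-o", target, source]


def convert(source, target, reader="markdown", writer="html"):
    """Run pandoc if it is installed; return True when the conversion ran."""
    if shutil.which("pandoc") is None:
        return False  # pandoc is not on the PATH, so do nothing
    subprocess.run(pandoc_command(source, target, reader, writer), check=True)
    return True


# The same conversion as the terminal command above, as an argument list.
command = pandoc_command("temp.md", "temp.html")
```

Calling `convert("temp.md", "temp.html")` would then run the conversion, and passing `writer="latex"` with a `.tex` target would produce LaTeX instead.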

Reproducible Statistics

But wait. There’s more!

If your work involves statistics and you’re familiar with R, you can write something called R Markdown. R Markdown files can contain both markdown and R code (and R code that writes markdown). So you can write an R Markdown file with your entire statistical analysis that writes out fresh statistics, tables, and figures every time you process the file. If you find an error in your work or you want to update the way a figure looks, just change the code in the R Markdown file and reprocess it. Everything else stays the same, and your changes are included.

I won’t get into the details of setting up an R Markdown file here, but basically, you create a file with the *.rmd extension and write in your markdown and R code. Then, you “make” the file by running R on it. There are different ways to do this. You could call R from the command line with Rscript, for example. In any case, the result will be a pandoc markdown file (with a *.md extension), which you can then convert to other formats (e.g., MS Word, LaTeX, etc.) as you would any other pandoc markdown file. In addition, you can easily share your R code with anyone who wants to check your work.

Disadvantages

A Command-Line Tool

For some people, the fact that pandoc is a command-line tool will be a disadvantage. Some graphical user interfaces (GUIs) for pandoc are listed here. I haven’t tried them, so I can’t make any recommendations.

The LaTeX Writer

One of the footnotes in this post discusses my complaints about the way pandoc handles citations. Another problem is the way pandoc writes out tables in LaTeX. Pandoc automatically writes out tables as longtables in LaTeX and sometimes inserts minipages in the middle of tables. For example, the following table is written in pandoc markdown and saved in a file called temp.md:

-------------------------------------------------------------
 Centered   Default           Right Left
  Header    Aligned         Aligned Aligned
----------- ------- --------------- -------------------------
   First    row                12.0 Example of a row that
                                    spans multiple lines.

  Second    row                 5.0 Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------

Table: Here's the caption. It, too, may span
multiple lines.

Pandoc will convert the document to LaTeX with the following terminal command:

pandoc -r markdown -w latex -o temp.tex temp.md

The above command creates a file called temp.tex with the following contents (although I added some line breaks for formatting):

\begin{longtable}[c]{@{}clrl@{}}
\toprule\addlinespace
\begin{minipage}[b]{0.15\columnwidth}\centering
Centered Header
\end{minipage} & %
\begin{minipage}[b]{0.10\columnwidth}\raggedright
Default Aligned
\end{minipage} & %
\begin{minipage}[b]{0.20\columnwidth}\raggedleft
Right Aligned
\end{minipage} & %
\begin{minipage}[b]{0.31\columnwidth}\raggedright
Left Aligned
\end{minipage}
\\\addlinespace
\midrule\endhead
\begin{minipage}[t]{0.15\columnwidth}\centering
First
\end{minipage} & %
\begin{minipage}[t]{0.10\columnwidth}\raggedright
row
\end{minipage} & %
\begin{minipage}[t]{0.20\columnwidth}\raggedleft
12.0
\end{minipage} & %
\begin{minipage}[t]{0.31\columnwidth}\raggedright
Example of a row that spans multiple lines.
\end{minipage}
\\\addlinespace
\begin{minipage}[t]{0.15\columnwidth}\centering
Second
\end{minipage} & %
\begin{minipage}[t]{0.10\columnwidth}\raggedright
row
\end{minipage} & %
\begin{minipage}[t]{0.20\columnwidth}\raggedleft
5.0
\end{minipage} & %
\begin{minipage}[t]{0.31\columnwidth}\raggedright
Here's another one. Note the blank line between rows.
\end{minipage}
\\\addlinespace
\bottomrule
\addlinespace
\caption{Here's the caption. It, too, may span multiple lines.}
\end{longtable}

Notice that it’s a longtable containing minipages. This looks to me like a bit of madness. My understanding is that longtables are used by default because if they’re not, tables that are longer than a single page will not appear correctly. And I assume that authors complained about this problem in the past leading to the current setup. However, longtables could at least be implemented more elegantly with the tabu package. In general, it seems that more LaTeX formatting should be left to the LaTeX template (which can be changed easily by authors) than the pandoc writer (which cannot be changed easily by authors).

Git

Another benefit of writing documents in markdown—or really any plain-text format—is the ease of using version control. Version control systems, such as git, track changes to files, and with websites like GitHub, it’s possible to keep a remote backup of your files and their changes. Version control is especially useful on long projects, such as books or dissertations, where you may want to keep separate chapters in separate files. And if you’re collaborating with tech-savvy people, you can all work on the same set of files and track changes collectively.

I won’t give a tutorial on git here, and unfortunately, there’s a bit of a learning curve. But here are some useful resources:

The Setup

In summary, the setup I recommend is writing documents in either pandoc markdown or R markdown (depending on whether they contain statistics) and using git to track changes to documents. This setup works very well with long works, such as books and dissertations, where it makes sense to separate chapters into individual files.


Notes:


  1. You need to tell pandoc the name of the BibTeX file when you execute the pandoc command on the command line, and the file needs to be in a folder where pandoc can find it (i.e., in the same folder as the markdown file or in the folder indicated by the --data-dir flag). Again, see the readme for details.

  2. I actually have some gripes with the syntax for pandoc’s in-text citations. It’s convenient for quick documents, but for a formal academic work, pandoc’s citation commands are somewhat lacking, and BibLaTeX’s in-text citation commands are much more powerful. This is a tricky situation because while you can write BibLaTeX macros directly in pandoc markdown, pandoc won’t use them unless it’s converting that document to LaTeX. So if you want to create an MS Word document instead of a PDF, you’ll lose your citations. Personally, my solution was to write a parser in perl that could find pandoc citations and replace them with BibLaTeX citations, so I could still output LaTeX documents containing BibLaTeX formatting. But that might not be a solution for everyone. If there’s enough interest, I will polish and publish my perl code for others to use.
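To give a sense of what the conversion in note 2 involves, here is a small Python sketch. It is not the perl parser mentioned above. It handles only the simplest citation form, a bracketed list of keys such as the example earlier in this post, and ignores prefixes, page locators, and author-suppressed citations. The function name `pandoc_to_biblatex` and the choice of BibLaTeX’s \autocite command are assumptions made for this example.

```python
import re

# Matches only simple citations such as [@smith04; @doe99].
CITATION_RE = re.compile(r"\[(@[\w:.-]+(?:\s*;\s*@[\w:.-]+)*)\]")


def pandoc_to_biblatex(text, command="autocite"):
    """Replace simple pandoc citations with a BibLaTeX citation command."""
    def convert(match):
        keys = [part.strip().lstrip("@") for part in match.group(1).split(";")]
        return "\\" + command + "{" + ",".join(keys) + "}"

    return CITATION_RE.sub(convert, text)


converted = pandoc_to_biblatex("Blah blah [@smith04; @doe99].")
# converted is "Blah blah \autocite{smith04,doe99}."
```

Citations with extra text inside the brackets are left untouched by this pattern, so a real converter would need a more complete grammar.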

Monday, October 13, 2014

Arizona State University Dissertation/Thesis Template in LaTeX

Arizona State University (ASU) is one of the largest universities in the US. It must have tens of thousands of graduate students in attendance at any given time, many of whom need to write and submit a thesis or dissertation. All theses and dissertations need to follow the formatting guidelines of ASU’s Graduate College (latest revision [July 2013] available here).

So I was surprised to see that ASU’s current LaTeX template is fairly basic. By writing this, I do not mean to criticize its author, who (as I understand the situation) was a graduate student who created a template that worked reasonably well and decided to make it available to others. ASU then adopted this student’s work as the template it would officially distribute to students. But as far as I know, the student who created the template did not invest a great deal of time or effort into it, and as a result, the template is rather shallow. It has the correct margins, and the table of contents will come out more or less correct, but what if you want to include appendices, for example? Or use biblatex for citations instead of natbib?

I did the formatting for my PhD dissertation on my own and created a new LaTeX template, which is available on GitHub here. (If you’re not familiar with git, you can grab everything simply by clicking the “Download ZIP” button to get the template and supporting files in a ZIP archive. Or just click here.)

Sample title page of dissertation template; click the image for the full sample PDF

The biggest (and, in my opinion, the most beneficial) difference between the official template and this new one is that the new template uses the memoir document class. The memoir document class is designed for formatting book-length works. For example, it has commands for indicating divisions between front matter, main matter, and back matter and adjusts formatting accordingly. So it’s a natural choice for formatting theses and dissertations, which are book-length works. memoir is also a very large document class that natively supports many features without having to load other packages. It can natively format footnotes and endnotes, for example, and the table of contents can be highly customized using only memoir commands. The template I created definitely loads other packages, but I would guess that memoir is probably the most complete document class out there. And finally, memoir has excellent documentation, which is currently over 600 pages long. If there’s some confusing code in the template I created or if someone wants to add a new feature to their own thesis/dissertation, there’s a better chance that the documentation for memoir will provide the answer than for other document classes.

Some of the other improvements over the official template include the following:

  • Includes all required and optional sections, including a copyright page, dedication, acknowledgements, preface, endnotes, and biographical sketch.
  • Correct formatting for main matter (chapters) and back matter (appendices), which makes it easy to organize your entire document.
  • For the typesetting engine, works with either pdftex or xetex. (xetex makes it easy to use any of the approved fonts.)
  • For references, works with natbib and biblatex. (biblatex makes it easy to use Chicago, MLA, and APA style references.)
  • Better separation of content and formatting. For example, write your table captions however you want and they will appear correctly in the list of tables. This arrangement makes it much easier to produce another, much better-looking version of your dissertation/thesis in case you want to share it with colleagues.
  • Internal document references work. For example, clicking on an in-text citation jumps down to that citation in the references list.
  • Bookmarks work, so there is a navigation side menu in the PDF that contains the major document elements (e.g., table of contents and each chapter heading), so the PDF is easier to navigate.
  • Writes PDF metadata (including the title, name, and keywords) automatically.
  • Uses the memoir document class, so it is easier to change formatting and create a book-length work in general.
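Supporting both engines usually comes down to a conditional in the preamble. The following is a minimal sketch of that idea, assuming the iftex package is available. The font name is a placeholder, and this is not necessarily how the template itself does it.

```latex
% Sketch: pick the font setup based on the engine (assumes the iftex package)
\usepackage{iftex}
\ifXeTeX
  % XeLaTeX: load system fonts through fontspec
  \usepackage{fontspec}
  \setmainfont{Times New Roman} % placeholder; use any approved, installed font
\else
  % pdfLaTeX: fall back to the standard encoding packages
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\fi
```

With a conditional like this, the same source file can be typeset with either engine without editing the preamble.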

There were challenges to getting all these features working together. Strangely enough, one of the most difficult challenges was getting chapter-level and part-level headings to appear uppercased in the table of contents. It turns out that the typical commands for uppercasing text in the table of contents conflict with the hyperref package. (I’ve written a separate post on my solution here.) But overall, I think I’ve found reasonably elegant solutions for implementing the formatting requirements in ASU’s style guide.

I have intentionally not created a style file yet. In my experience, troubleshooting a document with a custom style file leads to headaches because it requires hunting through both the style file and the preamble to figure out where problems are. I think it’s better to have all the potentially problematic code in one lengthy preamble. If there is enough interest in either a style file or packaging everything in a class, I will create them, but at least initially, I am just making a simple template file available to everyone.

Formatting a dissertation or thesis is often one of the less pleasant parts of the graduate student experience. It’s the last thing students need to do before they’re finally done with an often long and difficult graduate school experience, and formatting is usually tedious and time-consuming. Hopefully, this template can take some of the pain out of that experience for ASU graduate students.

Again, the template is available on GitHub here. You can grab everything simply by clicking the “Download ZIP” button to get the template and supporting files in a ZIP archive.

Monday, October 6, 2014

Uppercasing in a "memoir" Table of Contents with "hyperref"

memoir is one of my favorite document classes in LaTeX, but uppercased headings in a memoir table of contents conflict with the hyperref package. Uppercasing headings in the table of contents of a memoir document can be achieved in at least two basic ways. The first uses the \cftKfont font commands to send formatting instructions to the table of contents. The K in \cftKfont stands for the heading level,1 so to modify the chapter headings, for example, you would use the \cftchapterfont macro as shown below:

% Uppercase chapter headings in TOC
\renewcommand*{\cftchapterfont}%     
  {\normalfont\MakeUppercase}

Because \MakeUppercase is sort of brutal in the way that it uppercases, it can cause errors. The memoir documentation recommends using another macro \MakeTextUppercase, which would be used in the following way to uppercase chapter headings in the table of contents:

% Uppercase chapter headings in TOC
\renewcommand*{\cftchapterfont}%     
  {\normalfont\MakeTextUppercase}

These methods are simple, but—as noted above—they both conflict with the hyperref package.

This conflict is aggravating because hyperref makes PDFs better in many ways. For example, hyperref will automatically make internal document links active. So if you have citations throughout your work, those in-text citations will turn into links that jump down to their entries in the list of citations at the end of the document. Similarly, when the document refers to a table or figure (e.g., “see table 2 for more information”), the number becomes an internal document link that will take the reader to the table or figure. And entries in the table of contents also become internal document links. So there are a lot of benefits to simply using the following line in a preamble:

\usepackage{hyperref}

With specific commands, hyperref can do a lot more, such as write out PDF metadata and work with another package, called bookmark, to create a PDF navigation bar.
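As a minimal sketch of those extra features, the preamble lines below set PDF metadata with \hypersetup and load the bookmark package. The title, author, and keywords are placeholders, and the options shown are only one reasonable configuration.

```latex
% Sketch: PDF metadata and bookmarks (all values are placeholders)
\usepackage{hyperref}
\hypersetup{
  pdftitle    = {Title of the Dissertation},
  pdfauthor   = {Author Name},
  pdfkeywords = {keyword one, keyword two}
}

% Load bookmark after hyperref for the PDF navigation panel
\usepackage{bookmark}
\bookmarksetup{numbered, open} % show section numbers; expand the tree
```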

The conflict between these two uppercasing commands and hyperref is a known problem, so memoir provides another method, \settocpreprocessor, for sending uppercase formatting to headings in the table of contents (see p. 158 of the memoir documentation). The code below shows how to use this macro to uppercase part and chapter headings in the table of contents:

\makeatletter
\settocpreprocessor{part}{%
    \let\tempf@rtoc\f@rtoc%
    \def\f@rtoc{%
      \texorpdfstring{\MakeTextUppercase{%
        \tempf@rtoc}%
      }{\tempf@rtoc}%
    }% 
}
\settocpreprocessor{chapter}{%
    \let\tempf@rtoc\f@rtoc% 
    \def\f@rtoc{%
      \texorpdfstring{\MakeTextUppercase{%
        \tempf@rtoc}%
      }{\tempf@rtoc}%
    }% 
}
\makeatother

This code will uppercase the entries for parts and chapters in the table of contents, but it does not uppercase other part-level or chapter-level entries. The difference is important because entries such as the “List of tables” won’t be uppercased by these commands. Doesn’t it make more sense for entries at the same level to have the same formatting? Users can get around this problem by manually uppercasing table-of-contents entries, such as the name of the list of tables:

% Manually uppercase name of the list of tables
\renewcommand{\listtablename}
  {LIST OF TABLES}

But users would need to go back and manually change this and other commands if they want to produce a new version of their document without uppercased part-level and chapter-level headings. This approach violates the supposed strength of LaTeX—separating content from formatting. Instead, users need a way to uppercase part-level headings and chapter-level headings in memoir documents.

Fortunately, I have discovered a way to produce uppercase part-level and chapter-level headings by patching certain macros in the memoir document class. In the preamble, enter the following:

% Load 'etoolbox' to get the '\patchcmd' macro
\usepackage{etoolbox}

% Patch the command that writes part-level entries
%   to the table of contents, so they are in 
%   'normalfont' and uppercase
\makeatletter% 
\patchcmd{\l@part}%                     
    {\cftpartfont {#1}}%                
    {\normalfont \texorpdfstring{%      
      \uppercase{#1}}{{#1}} }%
    {\typeout{Success: Patch %
      'l@part' to uppercase %
      part-level headings in the %
      table of contents.}}%
    {\typeout{Fail: Patch %
      'l@part' to uppercase % 
      part-level headings in the %
      table of contents.}}%
\makeatother%

% Patch the command that writes chapter-level 
%   entries to the table of contents, so they are 
%   in 'normalfont' and uppercase
\makeatletter% 
\patchcmd{\l@chapapp}%                  
    {\cftchapterfont {#1}}%             
    {\normalfont \texorpdfstring{%      
      \uppercase{#1}}{{#1}} }%
    {\typeout{Success: Patch %
      'l@chapapp' to uppercase %
      chapter-level headings in the %
      table of contents.}}%
    {\typeout{Fail: Patch %
      'l@chapapp' to uppercase %
      chapter-level headings in the %
      table of contents.}}%
\makeatother%

% Load 'hyperref' to get the \texorpdfstring
%   command
\usepackage{hyperref}

This solution uses a similar approach to the one provided in the memoir documentation with the \settocpreprocessor macro in that both use the \texorpdfstring macro from the hyperref package. The \texorpdfstring macro lets users send different text depending on whether the text will be typeset by LaTeX or not. Chapter headings are an example of content that is both typeset by LaTeX in the table of contents and written to the PDF as bookmarks, which cannot contain typesetting commands, so \texorpdfstring is a great way to avoid these kinds of errors.
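The same macro is useful directly in headings. As a small illustration (the chapter title is made up), the first argument is what LaTeX typesets and the second is the plain-text version that goes into the PDF bookmark:

```latex
% Math in a heading: typeset the symbol, but write plain text to the bookmark
\chapter{Estimating \texorpdfstring{$\beta$}{beta} from Survey Data}
```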


Notes:


  1. See table 9.3 on p. 150 of the memoir documentation for the other values of K.

Sunday, July 22, 2012

Using All Kinds of Fonts in LaTeX (Part 1)

I like using LaTeX because of the beautiful PDFs that it produces. However, for a typesetting system that intends to be everything a typographer wants it to be, it’s surprisingly difficult to use a variety of fonts and maintain strong control over the typography.

The LaTeX Font Catalogue is a great resource for fonts. All of the fonts listed there can be used by including packages in the preamble, so they’re very easy to use. However, the selection is a bit limited. Arial is available as a close approximation to the ubiquitous Helvetica. But Arial/Helvetica is relatively thick. What if you want to use Helvetica Neue Light or Ultralight? As far as I know, I can’t use a package for either of those. And if I’m serious about document design, I’ll want access to display fonts for titles and headings, too. Unfortunately, the LaTeX Font Catalogue has a very poor selection of display fonts.

The next easiest way to get non-standard fonts into a document is to use XeTeX/XeLaTeX as the typesetting engine and the fontspec package. fontspec lets you insert system fonts into your documents (and unfortunately, it’s not compatible with pdfLaTeX). On a Mac, you can find your system fonts by looking into the Font Book application or by entering the following command in the Terminal:

fc-list

Because you’re going to get a lot of text in the output, you may want to pipe the results into a text file. You can use the following command to pipe the output into a plain-text file called fontlist.txt:

fc-list > fontlist.txt

Any font that is listed in the output can be included in a LaTeX document with the following code (in the document preamble). The example uses Helvetica Neue Light.

\usepackage{fontspec}              % Provide features for AAT
                                   %   and OpenType fonts
\setmainfont{Helvetica Neue Light} % Define the default font
                                   %   family

You can also set a sans-serif font and a monospaced font. For example, to set Helvetica Neue Light as the document’s sans-serif font, enter the following in the preamble of your LaTeX document:

\setsansfont{Helvetica Neue Light} % Make Helvetica Neue Light
                                   %   the default font for
                                   %   sans-serif text.
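Putting the pieces together, a complete test document might look like the sketch below. The font names are assumptions based on fonts that typically ship with a Mac; substitute any names that appear in your own fc-list output.

```latex
% Compile with xelatex. Font names are assumptions; check fc-list for yours.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Helvetica Neue Light} % default (roman) family
\setsansfont{Helvetica Neue}       % family used by \textsf and \sffamily
\setmonofont{Menlo}                % family used by \texttt and \ttfamily

\begin{document}
Body text in the main font.
\textsf{Sans-serif text.}
\texttt{Monospaced text.}
\end{document}
```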

And if you want to use a new font, you can simply install it on your system. For Mac OS X, at least, installing fonts is incredibly easy. You can go to a variety of websites to find and download new fonts to your computer.1 Usually, they’re in ZIP files, so you simply unzip the file and then double-click on the font file (e.g., a *.ttf file or an *.otf file). It will open in Font Book, and then you can click the “Install Font” button to install it (Source). It’s pretty straightforward.

So it seems like this solves the problem of getting different fonts into a LaTeX document, right? Well, yes, but there’s an important disadvantage to using XeLaTeX to typeset your document: you can’t use any of the features in the microtype package. The microtype package makes available several functions that typographers need, such as control over kerning and inter-word spacing. (I recommend playing around with inter-word spacing, for example, to quickly see how it changes the “feel” of the document.) microtype also does protrusion, which makes documents’ margins appear straight to the human eye. And because all things are complicated in LaTeX, microtype is not compatible with XeLaTeX.
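For reference, here is a minimal sketch of a pdfLaTeX document that turns on the microtype features mentioned above. The option values are just one starting point for experimenting, and the font packages are an assumption rather than a requirement.

```latex
% Compile with pdflatex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{lmodern}   % a scalable font family, so font expansion can work
\usepackage[
  protrusion = true,   % let punctuation hang slightly into the margins
  expansion  = true,   % stretch or shrink glyphs very slightly per line
  spacing    = true    % adjust inter-word spacing
]{microtype}

\begin{document}
A paragraph or two of running text makes the effect of each
option easier to see when you toggle it and typeset again.
\end{document}
```

Switching one option at a time between true and false and comparing the PDFs is a quick way to see what each feature contributes.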

So if you want access to all the features in microtype, you need to use pdfLaTeX as the typesetting engine, and in this case, you can’t use the fontspec package to include system fonts in the document. As a result, you need to install new fonts in your TeX system. I’ll explain how to do that in the next blog post.


Notes:


  1. I haven’t gotten that far into fonts, but Font Squirrel seems like a good resource. Keep in mind that fonts can be free for personal use but not for commercial use. Personally, I would prefer to use only unrestricted, free fonts, because some of my work is commercial, and I’m not sure what I’ll be using my personal work for in the future.

Tuesday, July 10, 2012

LaTeX Isn't Just for Technical Documents

LaTeX is not just for peer-reviewed math journals. You can use LaTeX to make PDFs that rival those made with expensive software, like the Adobe line of products. You have to get the basics down first, but then it’s just a matter of picking and using good fonts, colors, and layouts.

So here is one of my examples to prove this point. It’s a quick reference sheet for cooking. Although the content isn’t actually my own, the formatting and layout are. In a reddit post on the LaTeX sub-reddit, user a_contact_juggler challenged the sub-reddit to make a LaTeX version of user Fredthecoolfish’s single-sided, handwritten cooking cheat sheet. The original handwritten cheat sheet is available here.

You can see other people’s LaTeX versions in this reddit post. My version is below. Click on it to download the PDF.

Cooking cheat sheet

And here are the raw files:

There isn’t much education in this post. The main point is to show that it’s possible to make colorful, attractive PDFs for a general audience—not just engineers and math enthusiasts—with LaTeX.

Although the main point here isn’t to describe new functions in LaTeX, I did make an interesting discovery while working on this: it’s possible to turn images into links with the standard \href command. For example, enter the following in a LaTeX document:

\href{URL}{%
  \includegraphics[height=18pt,keepaspectratio]%
  {IMAGE_FILENAME}%
}

Replace URL with the URL you want to link to, and replace IMAGE_FILENAME with the file name of the image.
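A complete test document for this trick could look like the sketch below. The URL is a placeholder, and example-image is a stand-in graphic that is assumed to be available from the mwe package in your TeX distribution.

```latex
\documentclass{article}
\usepackage{graphicx} % provides \includegraphics
\usepackage{hyperref} % provides \href

\begin{document}
% The whole image becomes a clickable link to the URL
\href{https://www.example.com}{%
  \includegraphics[height=18pt,keepaspectratio]{example-image}%
}
\end{document}
```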

Sunday, July 1, 2012

Using In-text Fonts in LaTeX Figures

One thing that has always frustrated me about putting figures into Word documents is the mismatch between the fonts in the figures and the running text.

Drafting images in Word is a hassle, and it’s hard to retain their look when exporting them to other programs. More importantly, figures drafted in Word don’t look very sharp or professional. If you create the image in another program like MS Visio and then import it, the font size rarely matches the font size in the rest of the document. And if you change the font type in the Word document, you have to go back and rework all the text in the Visio file. Some of the line breaks may have changed because the new font is larger or smaller. You may need to change the alignment and indents to get the right look. And so on. It’s a time-consuming nuisance.

And if you use another program that only gives you an image file as output (e.g., a PDF or JPEG), it’s even harder to match the font size in the rest of the document. All you can do is resize the entire image until the font looks about the same as the surrounding text. If the figure is larger than the margins of the document when the font sizes finally match, you’ll have to redraft it and try again.

If you’re using LaTeX, there’s an elegant solution to this problem, and it’s one of the reasons I love preparing documents in LaTeX: You can use Inkscape to prepare the figure and then save a copy as a PDF with this option checked: “PDF+LaTeX: Omit text in PDF, and create LaTeX file”.

First, you need to draft the figure in Inkscape. I won’t go into details about how to use Inkscape except to provide the following tip on horizontally and vertically centering text in a box or rectangle:

  1. Create a shape like a box or rectangle and create a text object.
  2. Open the “Align and distribute...” dialogue in the “Object” menu (or press Shift+Ctrl+A).
  3. Choose a “Relative to:” mode that refers to an object. I prefer “Biggest object” as the text object ought to be smaller than the shape it appears in.
  4. Select both the text and the shape.
  5. Press both horizontal and vertical align buttons in the “Align and distribute...” dialogue.

Now the text will appear vertically and horizontally centered in the shape. The following image is an example I created for illustrative purposes.

Example figure

After drafting the figure in Inkscape:

  1. Go to the “File” menu, select “Save a copy...”.
  2. Choose “Portable Document Format (*.pdf)”.
  3. On the following screen, check the box for “PDF+LaTeX: Omit text in PDF, and create LaTeX file”. (See the figure below.)

Inkscape Screenshot: File -> Save a copy... -> Portable Document Format (*.pdf)

Inkscape separates the graphics and the text into two files: One is a PDF with the graphics and the other is a LaTeX file (with a *.pdf_tex extension) which contains instructions for placing the text and the text itself.

To insert this image into the LaTeX file, simply insert the following code in your LaTeX document:

\input{filename.pdf_tex}

You can set the width of the figure with \svgwidth, and LaTeX will maintain the aspect ratio from the original image. For example, insert the following code in your LaTeX document:

\def\svgwidth{\textwidth}

So the full code for inserting a figure like this could be the following:

\begin{figure}
  \centering
  \def\svgwidth{\textwidth}
  \input{filename.pdf_tex}
  \caption{the caption}
  \label{fig:thelabel}
\end{figure}

And you need to load the pstricks package in the preamble of your LaTeX document with the following code:

\usepackage{pstricks}

(Note that you don’t need the graphicx package.)
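The figure code above can be varied by changing \svgwidth. As a sketch (the 0.6 factor and the sentence after the figure are only illustrative), this version makes the figure 60% of the text width and refers to it from the running text:

```latex
\begin{figure}
  \centering
  % Defined inside the figure environment, so it only affects this figure
  \def\svgwidth{0.6\textwidth}
  \input{filename.pdf_tex}
  \caption{The caption}
  \label{fig:thelabel}
\end{figure}

As figure~\ref{fig:thelabel} shows, the labels use the document font.
```

Because the text is typeset by LaTeX at its normal size, changing \svgwidth scales the graphics but not the font size of the labels.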

There are a lot of ways to get figures into LaTeX documents, but I like this approach and promote it here, because it’s relatively easy. With other approaches like PGF/TikZ, PSTricks and Xfig, you need to either learn an extensive new set of commands and directly code the figure or use graphical user interfaces (GUIs) that will output code for you. But in my experience, GUI programs for those packages tend to be awkward and not highly developed, so you’re left with the unappealing and time-consuming chore of digging through instruction manuals to learn how to code your figure. Inkscape, on the other hand, is relatively easy to use, and if you don’t want to draft the figure in Inkscape directly, you can import it from another program. I have imported PDFs from Visio into Inkscape, for example, touched them up and then exported them as described above.

Here are some example files to see this approach in action. Put all three in the same directory and then typeset the *.tex file for the document using XeLaTeX.

Here is the output, and here is the SVG file for the image in case you want to play around with it in Inkscape.

And finally I have to admit that this is a post with poor attribution. I learned about this approach several months ago, and the sources that I used to figure it out have since been lost to my foggy memory. If people have sources and post them in the comments, that would be welcome and appreciated.