Bridging Code and Document: Insights into LaTeX + Python Workflows

LaTeX is renowned for its typesetting quality and the professional appearance it gives documents, particularly in academic, scientific, and technical fields. Python is a general-purpose programming language popular with researchers, data scientists, developers, and hobbyists alike. As your projects grow more complex, you will likely need both: robust computational workflows (Python) and high-quality document preparation (LaTeX).

This article aims to guide you through combining Python and LaTeX, starting from the fundamentals, gradually moving toward intermediate approaches, and culminating with advanced integrations and automation. Whether you are a student writing your first paper with embedded code examples or a seasoned developer seeking to automate large-scale documentation, these insights will help you unify your code and typeset environment.


Table of Contents#

  1. Introduction to LaTeX and Python
  2. Getting Started with LaTeX Basics
  3. Python in Document Preparation
  4. Essential Tools and Libraries
  5. Setting Up Your Environment
  6. Working with Minted for Syntax Highlighting
  7. Leveraging PythonTeX for Dynamic Documents
  8. Automating Compiling and Reporting
  9. Advanced Conversion and Integration Techniques
  10. Practical Example: Building a Data-Analyzing Document
  11. Professional-Level Expansions
  12. Conclusion

Introduction to LaTeX and Python#

Why Merge LaTeX and Python?#

  • Reproducible Research: Scientists and researchers strive to keep computations reproducible. If the raw results, plots, and tables in a paper are all generated by Python, embedding that directly in LaTeX ensures your final paper is always in sync with the latest data.
  • Automation: If you have to generate multiple documents—like identical reports for different clients—combining Python scripts with a LaTeX template can automate this repetitive process.
  • Professional Appearance: LaTeX excels at neatly typesetting equations, references, images, and tables in a uniform and high-quality format. Generating the figures with Python removes the manual export-and-paste step on top of that.
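As a rough illustration of the automation point above, the sketch below fills one LaTeX template for several clients. The @@key@@ placeholder convention, the render_report helper, and the client data are all hypothetical; the markers were chosen only because they do not collide with LaTeX's braces and dollar signs.

```python
# Hypothetical template: the @@key@@ markers are an illustrative convention,
# not a LaTeX or Python standard.
TEMPLATE = r"""\documentclass{article}
\begin{document}
\section*{Report for @@client@@}
Total orders this month: @@orders@@.
\end{document}
"""


def render_report(template, values):
    """Replace every @@key@@ marker with the matching value."""
    rendered = template
    for key, value in values.items():
        rendered = rendered.replace(f"@@{key}@@", str(value))
    return rendered


if __name__ == "__main__":
    # Made-up client data, for demonstration only.
    clients = [
        {"client": "Acme", "orders": 12},
        {"client": "Globex", "orders": 7},
    ]
    for row in clients:
        print(render_report(TEMPLATE, row))
```

In a real pipeline each rendered string would be written to its own .tex file and compiled. Note that the values are inserted as-is, so text containing characters such as & or % would need escaping first.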

Example Use Cases#

  • Academic Theses or Research Papers with integrated statistical analysis, simulations, or machine learning insights.
  • Technical documentation that needs up-to-date code snippets and outputs.
  • Corporate reporting that leverages data insights, with mass-generation of customized PDF reports.

Throughout the rest of this blog post, you will see several references and examples that make it easier to wield LaTeX and Python effectively.


Getting Started with LaTeX Basics#

LaTeX is not just another word processor. It uses a markup language to define how content is structured and displayed. Understanding some crucial concepts will save you time and confusion:

  1. Document Classes: Common classes include article, report, book, etc. Each class has different layouts and features.
  2. Packages: Extend functionality. For instance, graphicx handles images, amsmath helps with advanced math, biblatex or natbib supports bibliography, and so forth.
  3. Environments: Structures like \begin{equation} ... \end{equation}, \begin{itemize} ... \end{itemize}, or \begin{table} ... \end{table} help define sections of the text with specialized formatting.
  4. Compiling: You write .tex files in plain text, then use a LaTeX compiler (like pdflatex, xelatex, or lualatex) to generate a final PDF.

Example of a Minimal LaTeX Document#

\documentclass{article}
\usepackage[utf8]{inputenc} % Support for UTF-8 encoding
\usepackage{amsmath} % Math enhancements
\usepackage{graphicx} % Handling images
\begin{document}
\title{My Awesome Document}
\author{Jane Doe}
\date{\today}
\maketitle
\section{Introduction}
This is a minimal LaTeX document.
\section{Mathematical Example}
We can write an equation as:
\[
E = mc^2
\]
\end{document}

Compiling the above with pdflatex produces a simple PDF. Once you gain comfort with these basics, you can start looking into bridging Python outputs into this workflow.


Python in Document Preparation#

Python is a flexible language for data analysis, automation, app development, and more. When you bring Python into your LaTeX workflow, the key questions are:

  • How do we include code snippets in LaTeX?
  • How can we dynamically generate results (tables, plots, numbers) from Python code in our LaTeX file?
  • Can we automate the entire process so that the final PDF is always updated with new data?

Inclusion of Code Snippets#

You can simply copy and paste your Python code into a verbatim environment in LaTeX:

\begin{verbatim}
def fibonacci(n):
    a, b = 0, 1
    for i in range(n):
        print(a)
        a, b = b, a + b
\end{verbatim}

However, verbatim lacks syntax highlighting. There are advanced approaches—such as the minted or listings packages—that enable syntax highlighting and line numbering.

Generating Tables and Figures Dynamically#

Using Python, you might generate a CSV file or a set of values that feed into a LaTeX table. Manual copy-pasting is prone to error, so you can either generate the entire LaTeX code that includes the table or directly embed code that LaTeX can call upon during compilation (e.g., with PythonTeX). The best approach depends on your project’s complexity.
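To make the "generate the entire LaTeX code" option concrete, here is a minimal sketch that turns rows of Python data into a tabular environment. The helper names latex_escape and to_tabular are made up for this example, and the left-aligned column layout is an assumption.

```python
def latex_escape(text):
    """Escape the characters that LaTeX treats specially in normal text."""
    replacements = {
        "\\": r"\textbackslash{}",
        "&": r"\&",
        "%": r"\%",
        "$": r"\$",
        "#": r"\#",
        "_": r"\_",
        "{": r"\{",
        "}": r"\}",
        "~": r"\textasciitilde{}",
        "^": r"\textasciicircum{}",
    }
    # Work character by character so already-escaped output is never re-escaped.
    return "".join(replacements.get(char, char) for char in text)


def to_tabular(header, rows):
    """Build a LaTeX tabular environment from a header and data rows."""
    column_spec = "l" * len(header)  # assumption: every column left-aligned
    lines = [rf"\begin{{tabular}}{{{column_spec}}}", r"\hline"]
    lines.append(" & ".join(latex_escape(str(h)) for h in header) + r" \\")
    lines.append(r"\hline")
    for row in rows:
        lines.append(" & ".join(latex_escape(str(cell)) for cell in row) + r" \\")
    lines.extend([r"\hline", r"\end{tabular}"])
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative data only.
    print(to_tabular(["user_id", "visits"], [["u_1", 3], ["u_2", 5]]))
```

The resulting string can be written to a file such as table.tex and pulled into the main document with \input. If the data already lives in a pandas DataFrame, its to_latex method is an alternative worth checking.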


Essential Tools and Libraries#

Before diving into the actual workflows, let’s highlight several important tools:

| Name | Purpose | Usage Level |
|---|---|---|
| PythonTeX | Executes Python (and other languages) from within LaTeX, then integrates results directly into the PDF | Advanced |
| minted | Provides syntax highlighting for code examples | Intermediate |
| listings | Another syntax highlighting package for code listings | Intermediate |
| Pweave | Similar to Sweave in R; allows weaving Python code and LaTeX text into one file | Intermediate |
| PyLaTeX | A Python library to generate LaTeX code from Python scripts automatically | Advanced |
| nbconvert | Converts Jupyter Notebooks into LaTeX (among other formats) | Intermediate |
| pandoc | Universal document converter that supports reStructuredText, Markdown, LaTeX, and more | Intermediate |

Depending on your workflow, you may choose one or more of these tools. For instance, minted is perfect for highlighting code in a final PDF, while PythonTeX gives you the power to run Python code during LaTeX compilation, automatically capturing the output.


Setting Up Your Environment#

Installing LaTeX#

To use LaTeX effectively, you typically install a TeX distribution:

  • TeX Live (cross-platform, often the go-to for Linux users)
  • MiKTeX (common on Windows, also available on other platforms)
  • MacTeX (for macOS)

Once installed, you’ll have a suite of commands (pdflatex, xelatex, etc.). You can also work in an integrated editor like TeXstudio, Overleaf (online), or Visual Studio Code with LaTeX Workshop extension.

Installing Python and Required Packages#

  1. Python 3.x is recommended for most current applications.

  2. For data analysis or advanced workflows, a typical environment might include:

    • numpy
    • matplotlib
    • pandas
    • sympy (for symbolic math)
  3. Additional Python libraries or tools for bridging with LaTeX might include:

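    • PythonTeX: ships with TeX Live and MiKTeX, so install it through your TeX distribution (for example, tlmgr install pythontex) rather than pip; its Pygments dependency is available via pip install pygments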
    • pip install pylatex
    • pip install pweave
    • pip install nbconvert

Ensuring your environment is consistent is vital. If you rely on advanced features, you might consider a virtual environment (using venv or conda) so package versions remain stable across your projects.
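A quick way to check that an environment is complete is to look for the command-line tools on the PATH. The sketch below does this with the standard library; the list of command names is an assumption based on the tools discussed in this article, so adjust it to your own setup.

```python
import shutil

# Command names assumed from the tools discussed in this article.
REQUIRED_TOOLS = ["pdflatex", "pythontex", "pygmentize"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the commands that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

Running this at the start of a build script gives a clear message up front instead of a confusing failure halfway through compilation.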


Working with Minted for Syntax Highlighting#

Why Minted?#

minted is a LaTeX package that leverages the Pygments syntax highlighter, written in Python, to colorize code. It offers an excellent range of languages (including Python) and customization.

Basic Usage#

minted needs Pygments installed on your system so that it can call pygmentize (for example, through the python3-pygments system package or pip install pygments). A minimal example:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{minted}
\begin{document}
\section{Code Snippet Example}
Here is a Python function using \texttt{minted}:
\begin{minted}{python}
def greet(name):
    return f"Hello, {name}!"
\end{minted}
\end{document}

To compile this, you generally need to run something like:

pdflatex -shell-escape main.tex

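The -shell-escape flag allows LaTeX to call external commands (e.g., pygmentize). If you forget the flag, you’ll see compilation errors indicating minted could not run the necessary external processes. Newer minted releases (version 3 onward) use a different helper program and may work without the flag, so check the documentation for the version you have installed.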

Customizing Output#

minted provides various options for line numbering, style, frames, or background colors. For instance:

\usepackage{minted}
\usemintedstyle{friendly}

You can change friendly to other styles like monokai, default, emacs, or many others. Also, you can add options in the environment itself:

\begin{minted}[fontsize=\small, linenos]{python}
def fibonacci(n):
    """Print the Fibonacci sequence up to n."""
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a + b
\end{minted}

This snippet adds line numbers (linenos) and makes the font smaller (\small).
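If you generate listings from a script, the option syntax above is easy to build programmatically. The following sketch is one possible helper; the name minted_block and the convention that True means "bare flag" are assumptions made for this example, not part of minted.

```python
def minted_block(code, language="python", **options):
    """Wrap source code in a minted environment with optional settings.

    An option set to True becomes a bare flag (linenos); any other
    value becomes a key=value pair.
    """
    parts = []
    for key, value in options.items():
        parts.append(key if value is True else f"{key}={value}")
    option_text = f"[{', '.join(parts)}]" if parts else ""
    body = code.strip("\n")
    return (
        rf"\begin{{minted}}{option_text}{{{language}}}" + "\n"
        + body + "\n"
        + r"\end{minted}"
    )


if __name__ == "__main__":
    print(minted_block("print('hi')", linenos=True, fontsize=r"\small"))
```

The helper only assembles text; the resulting .tex file still has to be compiled with minted available.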


Leveraging PythonTeX for Dynamic Documents#

If you want code execution during LaTeX compilation—e.g., to automatically insert Python results (numbers, figures, or text) in your document—PythonTeX stands out as a powerful option.

Installation and Basic Configuration#

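PythonTeX is distributed as a LaTeX package and is included in TeX Live and MiKTeX. If your installation lacks it, add it through the TeX package manager, for example:

tlmgr install pythontex

It also relies on Pygments for syntax highlighting (pip install pygments).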

Then in your LaTeX file:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{pythontex}
\begin{document}
\section{Using PythonTeX}
Let's compute something in Python and display the result here.
\begin{pycode}
# This Python code is run at compilation time
x = 42
\end{pycode}
The value of x is: \py{ x }
\end{document}

To compile:

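  1. Run pdflatex -shell-escape myfile.tex (writes the embedded Python code to an auxiliary .pytxcode file; PythonTeX itself does not need -shell-escape, but the flag is required if the document also uses minted)
  2. Run pythontex myfile.tex (executes the Python code and saves its output)
  3. Run pdflatex -shell-escape myfile.tex again (pulls the saved output into the final PDF)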

This might vary depending on your editor—some can automate these steps for you.

PythonTeX Environments#

PythonTeX offers different environments:

  • \begin{pyblock} ... \end{pyblock} runs the code and also typesets it in the document; anything it prints is inserted where you call \printpythontex.
  • \begin{pycode} ... \end{pycode} runs the code without displaying it; anything it prints is inserted into the document at that point.
  • \py{...} evaluates an inline expression and inserts the result as text.
  • \begin{pyconsole} ... \end{pyconsole} shows both commands and outputs in a console-style format.

Inserting Plots#

You can generate plots with matplotlib or other libraries, save them to a file, and then embed them in the document. For example:

\begin{pycode}
import matplotlib.pyplot as plt
plt.figure()
plt.plot([1, 2, 3, 4], [1, 4, 2, 3], 'ro-')
plt.title('Simple Plot')
plt.savefig('myplot.png')
plt.close()
\end{pycode}
\begin{figure}[ht]
\centering
\includegraphics[width=0.5\textwidth]{myplot.png}
\caption{A Plot Generated by PythonTeX}
\end{figure}

You include the plot just like any normal LaTeX figure. Everything is automated, provided you re-run the appropriate commands.


Automating Compiling and Reporting#

Shell Scripting#

If you frequently re-generate reports, you can create a shell script (on Linux or macOS) or a batch file (on Windows). Example for Linux:

#!/bin/bash
pdflatex -shell-escape myreport.tex
pythontex myreport.tex
pdflatex -shell-escape myreport.tex

If you need to pass specific arguments or environment variables (e.g., to run a different Python environment), just layer that into the script. This approach helps ensure you or your colleagues only need to run a single command to produce updated PDFs.
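The same three-step build can also be driven from Python, which helps on Windows or when the build is part of a larger script. This is a minimal sketch: the function names are hypothetical, and the optional runner parameter exists only so the command sequence can be tested without a TeX installation.

```python
import subprocess


def build_steps(tex_file):
    """Return the three commands from the shell script above, in order."""
    latex = ["pdflatex", "-shell-escape", tex_file]
    return [latex, ["pythontex", tex_file], latex]


def build(tex_file, runner=subprocess.run):
    """Run each step in turn; check=True stops at the first failing command."""
    for command in build_steps(tex_file):
        runner(command, check=True)
```

Calling build("myreport.tex") runs the real tools. With check=True, subprocess.run raises CalledProcessError as soon as one step exits with an error, so a failed LaTeX pass does not go unnoticed.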

Makefile Approach#

When building more complex projects, a Makefile can handle dependencies. The advantage is that make checks timestamps of files and only recompiles if needed:

all: myreport.pdf
myreport.pdf: myreport.tex
	pdflatex -shell-escape myreport.tex
	pythontex myreport.tex
	pdflatex -shell-escape myreport.tex
clean:
	rm -f *.aux *.log *.out *.pyg *.pytxcode *.pytxpyg *.pdf

Typing make in your terminal triggers the build, and make clean removes extraneous files.
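The timestamp check that make performs can be sketched in a few lines of Python, which may help when you want the same behavior inside a Python build script. The function name needs_rebuild is made up for this illustration, and real make handles many more cases (pattern rules, phony targets, chains of dependencies).

```python
import os


def needs_rebuild(target, sources):
    """Return True if the target is missing or older than any source file."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(source) > target_time for source in sources)


if __name__ == "__main__":
    # File names follow the Makefile above; both may be absent on your machine.
    if os.path.exists("myreport.tex"):
        print("Rebuild needed:", needs_rebuild("myreport.pdf", ["myreport.tex"]))
```

For a PythonTeX document, the source list would reasonably include any data files and imported Python modules as well, since changes to those also make the PDF stale.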


Advanced Conversion and Integration Techniques#

Pweave#

Pweave is similar to Sweave (in R) or knitr, letting you weave Python code into document files. A Pweave file typically uses a mix of Python code blocks and markup. When processed, it executes the code and inserts results into your final document.

For instance, a file myreport.texw (the extension Pweave uses for LaTeX sources with noweb-style code chunks) might include:

Some text about data.
<<echo=False, results='verbatim'>>=
import pandas as pd
df = pd.DataFrame({"A": [1,2,3], "B": [4,5,6]})
print(df)
@
LaTeX instructions, math, etc.
<<echo=False, fig=True>>=
import matplotlib.pyplot as plt
plt.plot([1,2,3],[4,5,6])
plt.savefig('test.png')
@
\includegraphics{test.png}

You run pweave -f tex myreport.texw to generate a LaTeX file that includes both code outputs and references to figures. Then compile that LaTeX file with pdflatex.

Jupyter Notebooks with nbconvert#

If your workflow is heavily reliant on Jupyter Notebooks, you can convert .ipynb files into LaTeX directly:

jupyter nbconvert --to latex mynotebook.ipynb

You can then compile the resulting .tex file. This approach helps if you do a lot of interactive exploration but still desire a professionally typeset final output.

pandoc One-Stop Conversions#

pandoc is a Swiss Army knife for converting between document formats. Suppose your source is Markdown with embedded code blocks. With pandoc, you can:

pandoc myfile.md -o myfile.pdf --pdf-engine=xelatex

Pandoc reads fenced code blocks in Markdown and highlights them with its own built-in highlighter, so the PDF comes out with syntax highlighting without any call to Pygments. Pandoc does not execute the code, though. For code execution you need a pandoc filter or a tool built on top of pandoc, such as Codebraid.


Practical Example: Building a Data-Analyzing Document#

To illustrate a typical scenario, let’s design a short pipeline for a fictional data analysis. Suppose you have a dataset containing user activity logs, and you want to generate a daily summary PDF.

Step 1: Python Data Prep#

You might have a Python script (analysis.py) that:

  1. Reads a CSV file (logs.csv).
  2. Performs some aggregation or analysis.
  3. Outputs results and maybe saves a figure (activity_plot.png).

Example code:

import pandas as pd
import matplotlib.pyplot as plt
def analyze_data(input_csv):
    """Summarize the activity log and save a bar chart of daily counts."""
    df = pd.read_csv(input_csv, parse_dates=['timestamp'])
    daily_counts = df.groupby(df['timestamp'].dt.date).size()
    # Save basic stats
    summary = {
        "total_records": len(df),
        "unique_users": df['user_id'].nunique(),
        # Number of calendar days spanned, counting the first and last day
        "days_covered": (daily_counts.index[-1] - daily_counts.index[0]).days + 1,
    }
    # Plot daily activity
    plt.figure(figsize=(8, 4))
    daily_counts.plot(kind='bar')
    plt.title("Daily Activity Counts")
    plt.xlabel("Date")
    plt.ylabel("Activity Count")
    plt.tight_layout()
    plt.savefig("activity_plot.png")
    plt.close()
    return summary, daily_counts
if __name__ == "__main__":
    summary, daily_counts = analyze_data("logs.csv")
    print(summary)
    print(daily_counts)

When this script is run, it produces a dictionary summary plus a bar plot of daily counts.

Step 2: LaTeX Report with PythonTeX#

Create a LaTeX file (report.tex):

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}
\usepackage{pythontex}
\begin{document}
\title{Daily Activity Report}
\author{Data Analytics Team}
\date{\today}
\maketitle
\section{Introduction}
This report summarizes user activity from our platform logs.
\begin{pycode}
# Insert your analysis code or import from a module
import analysis
summary, daily_counts = analysis.analyze_data("logs.csv")
\end{pycode}
\section{Key Results}
\begin{itemize}
\item Total Records: \py{summary["total_records"]}
\item Unique Users: \py{summary["unique_users"]}
\item Days Covered: \py{summary["days_covered"]}
\end{itemize}
\section{Activity Plot}
\begin{figure}[h!]
\centering
\includegraphics[width=0.7\textwidth]{activity_plot.png}
\caption{Daily Activity Counts}
\end{figure}
\section{Detailed Counts}
\begin{pycode}
# Print one line of LaTeX per day; pycode inserts the printed output here
for date_val, count_val in daily_counts.items():
    print(rf"Date: {date_val}, Count: {count_val} \\")
\end{pycode}
\end{document}

Compile commands:

  1. pdflatex -shell-escape report.tex
  2. pythontex report.tex
  3. pdflatex -shell-escape report.tex

Voila! A single PDF is generated with the aggregated stats, a bar chart, and a list of daily counts.

Step 3: Autogenerate New Reports#

If you receive a new logs CSV every day (e.g., logs_2023-10-09.csv), you could script:

#!/bin/bash
cp logs_2023-10-09.csv logs.csv
pdflatex -shell-escape report.tex
pythontex report.tex
pdflatex -shell-escape report.tex
mv report.pdf report_2023-10-09.pdf

Now you have a daily pipeline generating custom PDFs.
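The script above hard-codes one date. A small Python helper can derive the output name from whichever log file arrives, as in this sketch. It assumes the logs_YYYY-MM-DD.csv naming shown in the example, and the function names are hypothetical.

```python
import re
from datetime import date
from pathlib import Path

# Assumes the logs_YYYY-MM-DD.csv naming used in the example above.
LOG_PATTERN = re.compile(r"logs_(\d{4})-(\d{2})-(\d{2})\.csv$")


def report_name_for(log_file):
    """Map logs_YYYY-MM-DD.csv to report_YYYY-MM-DD.pdf."""
    match = LOG_PATTERN.search(Path(log_file).name)
    if match is None:
        raise ValueError(f"Unexpected log file name: {log_file}")
    year, month, day = (int(part) for part in match.groups())
    stamp = date(year, month, day)  # raises ValueError for impossible dates
    return f"report_{stamp.isoformat()}.pdf"


def latest_log(folder):
    """Return the newest dated log in a folder, or None if there is none."""
    candidates = sorted(
        path for path in Path(folder).glob("logs_*.csv")
        if LOG_PATTERN.search(path.name)
    )
    # ISO dates sort correctly as plain text, so the last entry is the newest.
    return candidates[-1] if candidates else None
```

A daily job could call latest_log on the drop folder, copy the result to logs.csv, run the three build commands, and rename report.pdf to the value returned by report_name_for.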


Professional-Level Expansions#

Continuous Integration (CI)#

As your documentation needs grow, you might adopt a CI service (like GitHub Actions, GitLab CI, or Jenkins) to automatically build and test your LaTeX + Python content whenever changes are committed. This helps teams maintain consistent, up-to-date artifacts.

Large Multi-File Projects#

For substantial projects (e.g., a 200-page dissertation with experimental results), you can break your .tex sources into smaller files. You can also maintain separate Python scripts or notebooks. Good organization and naming schemes are essential. You might have:

dissertation/
    chap1_introduction.tex
    chap2_background.tex
    chap3_methods.tex
    chap4_results.tex
    chap5_discussion.tex
    main.tex
    scripts/
        analysis.py
        generate_plots.py
LaTeX references these chapters with \input{chapX_filename} or \include{chapX_filename}; \include expects the file name without the .tex extension and starts each chapter on a new page. Use make or a script to orchestrate the build.
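With many chapters, the list of \include lines can itself be generated. The sketch below scans a folder for files that follow the chapN_name.tex pattern from the layout above; the function name chapter_includes is an assumption made for this example.

```python
import re
from pathlib import Path

# Matches the chapN_name.tex naming used in the layout above.
CHAPTER_PATTERN = re.compile(r"chap(\d+)_.*\.tex$")


def chapter_includes(folder):
    """Return one include line per chapter file, ordered by chapter number."""
    chapters = []
    for path in Path(folder).glob("chap*.tex"):
        match = CHAPTER_PATTERN.match(path.name)
        if match:
            # path.stem drops the .tex extension.
            chapters.append((int(match.group(1)), path.stem))
    # Sort numerically so chap10 comes after chap9, not after chap1.
    return [rf"\include{{{stem}}}" for _, stem in sorted(chapters)]


if __name__ == "__main__":
    print("\n".join(chapter_includes(".")))
```

The output can be written to a small file such as chapters.tex and pulled into main.tex with \input, so adding a chapter only requires creating the new file.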

Dockerization#

If you want maximum reproducibility across different operating systems, you can Dockerize your environment. For instance, you create a Dockerfile that installs TeX Live, Python, your required Python packages, and any additional tools (e.g., pythontex). Then, your entire team or a CI runner can consistently produce the same PDF without “but it works on my machine” woes.

Handling Special Fonts and Unicode#

Some projects require custom fonts, especially for brand styling or foreign characters. Using xelatex or lualatex simplifies using system fonts and advanced OpenType features. This is handy if your project involves multiple languages or you need a perfect match to corporate brand guidelines.

Macro-Driven Automation#

LaTeX macros can drastically reduce repeated code. For example, if you’re referencing the same dataset name or parameter in multiple sections, you can store that in a LaTeX macro or use PythonTeX variables. This approach ensures consistency. In large documents, you only update the variable in one place for it to reflect throughout.
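One way to keep such values in a single place is to let Python write the macro definitions. This sketch turns a dictionary into \newcommand lines; the function name and the example values are hypothetical.

```python
import re


def macro_definitions(values):
    """Turn a dict of names and values into LaTeX newcommand lines."""
    lines = []
    for name, value in values.items():
        # LaTeX command names may contain letters only (no digits or underscores).
        if not re.fullmatch(r"[A-Za-z]+", name):
            raise ValueError(f"Macro names may contain letters only: {name!r}")
        # Values are inserted as-is; escape special characters beforehand if needed.
        lines.append("\\newcommand{\\" + name + "}{" + str(value) + "}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only.
    print(macro_definitions({"datasetName": "logs.csv", "sampleSize": 1200}))
```

Saving the output as, say, variables.tex and loading it with \input in the preamble makes \datasetName and \sampleSize available throughout the document.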


Conclusion#

Bridging LaTeX and Python is powerful not just for academic or scientific papers but for any scenario requiring precise, automated, and consistent documentation. We began with core LaTeX concepts, then explored how Python can enhance document creation—through syntax highlighting (minted), code execution within LaTeX (PythonTeX), or external weaving (Pweave). We tackled automation via shell scripts, make, or Docker, and examined advanced techniques including converting Jupyter notebooks or leveraging continuous integration to maintain reproducible document builds.

Key takeaways:

  1. A general-purpose language like Python offers flexibility and computational depth for your documents.
  2. LaTeX remains unmatched in typesetting quality and advanced document structures.
  3. By integrating these two, you can produce dynamic, reproducible, and professional documents.
  4. Tools like minted, PythonTeX, and Pweave bring different levels of convenience and power, so choose based on your project’s scope and needs.

As you gain familiarity with these workflows, you can move from simple code listings to deeply integrated data analysis, even building enterprise-scale document automation pipelines. Whether you’re writing a short assignment or publishing an entire book, Python + LaTeX will provide a stable, elegant, and efficient workflow.

Source: https://science-ai-hub.vercel.app/posts/554148ea-2bb8-45e3-91e5-ef2aa37c755f/3/
Author: Science AI Hub
Published: 2025-05-27
License: CC BY-NC-SA 4.0