# Paychex payroll importer for Beancount

This tool automates the monthly task of transcribing a payroll CSV from service
provider Paychex into around 300 lines of fairly intricate Beancount bookkeeping
data.

## Usage

To run, the program requires a Java runtime (tested with OpenJDK 17 and 21).
On Debian 12 (Bookworm):

    sudo apt install openjdk-17-jre

Run a demo with two example employees, Jack and Jill Citizen:

    java -jar import-x.x.x-standalone.jar --demo

Provide your own payroll data with:

    java -jar import-x.x.x-standalone.jar --csv resources/example-paychex-pay-item-details.csv --total-fees 206.50

In the output of the above command, values such as the date, the period covered
and the receipt/invoice numbers appear as "TODO" placeholders that you are
expected to fill in later. If you prefer, you can provide any or all of these
explicitly:

    java -jar import-x.x.x-standalone.jar \
      --csv resources/example-paychex-pay-item-details.csv \
      --date 2023-12-29 --period 'December 2023' --total-fees 206.50 \
      --pay-receipt-no rt:19462/674660 --pay-invoice-no rt:19403/675431 \
      --fees-receipt-no rt:19459/675387 --fees-invoice-no rt:19459/674887 \
      --retirement-receipt-no rt:19403/676724 --retirement-invoice-no rt:19403/675431

You can test the output in Beancount by adding the following header entries to define the accounts:

    2023-01-01 open Assets:Citizens:Check1273
    2023-01-01 open Expenses:Payroll:Salary
    2023-01-01 open Expenses:Payroll:Fees
    2023-01-01 open Expenses:Payroll:Tax
    2023-01-01 open Expenses:Hosting
    2023-01-01 open Expenses:Insurance
    2023-01-01 open Liabilities:Payable:Accounts

Then run Beancount with:

    bean-report [your file] balances

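The testing steps in this section can be combined into a small shell helper. This is a minimal sketch, not part of the project: the function names `write_accounts` and `check_import` are hypothetical, and it assumes the importer prints its Beancount entries to standard output.

```shell
# write_accounts FILE: write the account definitions from this README to FILE.
write_accounts() {
  cat > "$1" <<'EOF'
2023-01-01 open Assets:Citizens:Check1273
2023-01-01 open Expenses:Payroll:Salary
2023-01-01 open Expenses:Payroll:Fees
2023-01-01 open Expenses:Payroll:Tax
2023-01-01 open Expenses:Hosting
2023-01-01 open Expenses:Insurance
2023-01-01 open Liabilities:Payable:Accounts
EOF
}

# check_import LEDGER JAR [IMPORTER ARGS...]: build a throwaway ledger from the
# importer output and report its balances.
# Assumption: the importer prints its Beancount entries to standard output.
check_import() {
  ledger=$1 jar=$2
  shift 2
  write_accounts "$ledger" &&
    java -jar "$jar" "$@" >> "$ledger" &&
    bean-report "$ledger" balances
}
```

For example, `check_import payroll-test.beancount import-x.x.x-standalone.jar --demo` would write the accounts, append the demo entries and print the balances, where `payroll-test.beancount` is an arbitrary file name.
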
## Building

To build, you need a JDK (tested with OpenJDK 17 and 21) and Clojure. On
Debian 12 (Bookworm):

    sudo apt install openjdk-17-jdk clojure

To build, run:

    bin/build

This outputs a JAR file like `target/import-x.x.x-standalone.jar`, where the
version number is based on the Git revision.

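Because the version in the JAR name changes with each revision, it can be handy to locate the most recent build without typing the version. The following is a minimal sketch under that assumption; `newest_jar` is a hypothetical helper, not something provided by `bin/build`, and it relies on the `-nt` test operator, which common shells such as bash and dash support.

```shell
# newest_jar [DIR]: print the most recently modified standalone JAR in DIR
# (default: target). Returns non-zero if no JAR is found.
newest_jar() {
  dir=${1:-target}
  newest=
  for jar in "$dir"/import-*-standalone.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    if [ -z "$newest" ] || [ "$jar" -nt "$newest" ]; then
      newest=$jar
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

With this in place, `java -jar "$(newest_jar)" --demo` runs the demo against whichever JAR was built last.
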
## Development

Run tests with:

    bin/test

You can run without building using:

    bin/dev --csv resources/example-paychex-pay-item-details.csv --total-fees 206.50

The project is set up for development in Emacs and CIDER-mode. Open a source
file and run `cider-jack-in`.

## Why Clojure?

Clojure is well suited to this kind of data transformation/manipulation. Even
without fully understanding the implications of all the accounting/tax
concepts, I was able to parse the previous human-generated Beancount entries
into a data structure and work backwards to produce results that matched
exactly, without worrying about exact text formatting. Libraries like
deep-diff2 and the editor-integrated REPL made it very efficient to compare the
data structures and build the program - certainly much more efficient than
developing the same program in Python. It's only just over 500 lines of code
and we've had zero bugs as of writing (one year since it was built).