Nathan Grigg

The DMV

I had the “pleasure” of visiting the DMV this week to apply for a driving permit for my oldest kid. By a miraculous sequence of events, we got the permit in a single visit, but it was a close call.

Because the permit will eventually turn into a driver’s license and therefore a REAL ID, the application required two different documents to provide proof of address. There is a long list of valid documents, such as utility bills and property tax bills, and furthermore the DMV recognizes that not everyone living at an address receives such bills:

What if I do not have one of the above residency documents?

You can use a relative’s (parent, child, spouse/domestic partner) residency document if you live at the same address and provide a document (such as a birth or marriage certifcate [sic]) that shows that relationship.

It all seems reasonable enough, but the rules are implemented like a poorly-written computer program.

The question is, can I use my driver’s license (together with a birth certificate) as proof of my teen’s residency? In theory, this should count as definitive proof of address, since they required me to show two address documents in order to receive the license in the first place. At the very least, it should count as one of the two factors, at least as valid as a SoCalGas bill that anyone with a basic PDF editor could easily doctor.

In practice, as you have probably guessed, it counts as nothing. Why? Because the main list of documents is written assuming that they are in the name of the applicant, and this “relative’s residency document” special case is tacked on at the end. And of course, it would be silly to say that you could use your current REAL ID as proof of address to get a REAL ID, so you cannot use a relative’s REAL ID as proof of address to get your REAL ID.

Being a paranoid person, I brought two documents in addition to my driver’s license, but even that was almost not enough. See, my address can be written as either 221B Baker St or 221 Baker St #B. The two bills that I brought didn’t match, which was apparently a problem, and my driver’s license wasn’t going to get me out of it. The only thing that saved me (this is the miraculous part) was that one of the two bills had the address written both ways.

(For completeness, two other miracles. One, that my kid passed the ridiculous written exam on the first try. A test that did have a question about NEVs without explaining the acronym, and is known for questions like “In which of these locations is it illegal to park? (a) blocking an unmarked crosswalk (b) in a bicycle lane or (c) within three feet of a driveway.” The answer is (a). Nobody knows why. The second miracle is that my teen even got to take the test in the first place, because the DMV shut down the testing center at 4:30 on the dot, sending away everyone who was in line at the time. Credit for this miracle goes to the employee who processed our application, because she shut down her station and went over to the photo station to clear out the queue, getting us through and into the testing center with less than a minute to spare. At the time, we had no idea that we were up against a clock, but I’m pretty sure that she knew and intervened.)

Anyway, now it is time for 50 hours of supervised (by me) driving practice. Wish us luck!


Scan2email

I have a 5-year-old Brother Scanner (the ADS-1700W) that serves me pretty well. It is small enough to keep on my desk, fast, and has a Wi-Fi connection. But getting scans from the scanner to the computer is sometimes bothersome. My primary way of using it is to scan to a network folder on my home server, which is great for archiving things, but not so great for family members or when I need something right away.

My preferred way to get a quick scan is by email. You immediately have it on whatever device you are using, and you have it saved for later if you need it. The Brother Scanner has a scan-to-email function, but it is buggy. Specifically, it sends slightly malformed emails that Fastmail accepts but Gmail returns to sender.

But since network scanning is rock solid, last year I wrote a program to watch a set of folders and send by email whatever files it finds. I was on a bit of a Go kick at the time, and I think Go works pretty well for the task.

Here is the relatively short program that I wrote. It is meant to be running as a daemon, and as long as it is running, it will email you all the files that it finds. Since the interesting parts are at the end and I don’t think anyone will read that far, I’ll show you the program in reverse order, starting with the main function.

 96  func main() {
 97      fmt.Println("Starting up")
 98      changes := make(chan int)
 99      go waitForChange(changes)
100
101      ticks := time.Tick(5 * time.Minute)
102      for {
103          err := filepath.Walk("/home/scanner/incoming", sendFile)
104          if err != nil {
105              fmt.Println(err)
106          }
107          select {
108          case <-changes:
109              fmt.Println("File changed, looking for emails to send")
110          case <-ticks:
111              fmt.Println("5m has passed, looking for emails to send")
112          }
113      }
114  }

We start up, print a nice message and make a channel, which is a Go structure that allows us to pass data, in this case integers, between different threads of the program. We call it changes because it will notify us every time there has been a filesystem change.

The go keyword on line 99 starts waitForChange on a separate thread, which is a good thing for us, because you will later see that it runs an infinite loop. We pass it the channel so that it can notify us when it sees a change.

On line 101 we get a Ticker channel, which will receive a signal every five minutes. Since I don’t completely trust that I will be notified every time a file changes, every once in a while we want to look through the directories to see if we find anything.

Starting at line 102, we have an infinite loop here in the main thread. This starts by mailing out any files that are waiting. Then we get to the select statement, which pauses and listens to both the changes channel and the ticks channel. The somewhat strange arrow syntax means that we are attempting to read values from the channel. If we wanted, we could assign the values we read to a variable and do something with them, but we don’t care what is on the channels. As soon as another thread writes to one of these channels (whichever channel comes first), we write the corresponding log statement and then continue back to the top of the for loop, which mails out any files we find, and then goes back to waiting for action on the channels.

By having the two channels, we have programmed the logic to walk the filesystem every time we see a change and every 5 minutes, but, crucially, never more than once at a time. In reality, the change watcher is very reliable, and the email generally arrives seconds after the paper comes out of the scanner.

func waitForChange(c chan<- int) {
    defer close(c)
    for {
        cmd := exec.Command("/usr/bin/inotifywait", "-r", "-e", "CLOSE_WRITE", "/home/scanner/incoming")
        err := cmd.Run()
        if err != nil {
            fmt.Printf("%v\n", err)
            time.Sleep(5 * time.Minute)
        }
        c <- 0
    }
}

Here is the waitForChange function, which just calls inotifywait. This in turn runs until someone writes a file and then exits. At this point, our function writes 0 into the channel, which kicks the main thread into action. Meanwhile this thread calls inotifywait again, to begin waiting for the next change.

func sendFile(path string, info os.FileInfo, err error) error {
    if err != nil {
        return err
    }
    if info.IsDir() {
        return nil
    }
    to, err := getToAddress(path)
    if err != nil {
        return err
    }
    isOpen, err := fileIsOpen(path)
    if err != nil {
        return err
    }
    if isOpen {
        return fmt.Errorf("Skipping %v because it is opened by another process\n", path)
    }
    if err := sendEmail(to, path); err != nil {
        return err
    }
    if err := os.Remove(path); err != nil {
        return err
    }
    return nil
}

This sendFile function is called on every file in the directory tree. This is where Go gets annoyingly verbose. So much error handling! But fairly straightforward. As we walk the tree, we skip directories, send out emails if we have a file, and then delete the file after we send it.

func fileIsOpen(path string) (bool, error) {
    cmd := exec.Command("/usr/bin/lsof", "-t", path)
    err := cmd.Run()
    if err == nil {
        return true, nil
    }
    if exitError, ok := err.(*exec.ExitError); ok && exitError.ExitCode() == 1 {
        return false, nil
    }
    return true, err
}

This fileIsOpen function wasn’t there at first, but my early tries sent out files that were still being uploaded. Live and learn.

func sendEmail(to string, doc string) error {
    msg := gomail.NewMessage()
    msg.SetHeader("From", "scanner@xxxx")
    msg.SetHeader("To", to)
    msg.SetHeader("Subject", "Here is your scanned document")
    msg.SetBody("text/plain", "")
    msg.Attach(doc)

    n := gomail.NewDialer("smtp.example.com", 465, "user", "pass")

    if err := n.DialAndSend(msg); err != nil {
        return err
    }

    fmt.Printf("Sent %v to %v\n", doc, to)
    return nil
}

It is relatively simple to send an email using this third-party gomail package. And it isn’t malformed like the scanner’s attempts to send email!

func getToAddress(path string) (string, error) {
    var addresses = map[string]string{
        "amy":        "xxx@yyyy",
        "nathan":     "xxx@wwww",
    }

    _, lastDirName := filepath.Split(filepath.Dir(path))
    to, ok := addresses[lastDirName]
    if !ok {
        return "", fmt.Errorf("No email address available for %v", lastDirName)
    }
    return to, nil
}

This is a relatively simple function that decides who to email based on the folder that the file is in. This is the abbreviated version; my older kids also have emails configured here.

package main

import (
    "fmt"
    gomail "gopkg.in/gomail.v2"
    "os"
    "os/exec"
    "path/filepath"
    "time"
)

Finally, the most boring part of all, the import section.


New Computer

I bought a new computer. The Beelink U59 Mini PC 11th Gen 4-Cores N5105 cost me $120 from Amazon. It has a somewhat recent Intel Processor, 8GB RAM and a 500 GB SSD. It came with Windows installed, but I erased it and installed Debian with no GUI. It came with instructions to configure autoboot on power failure so I can hide it away in a closet.

This fills a gap I’ve had for some time. My house has been laptop-only for several years now, which leaves me nowhere to run always-on automation things. I have a Synology that I use for storage, but it is very low powered and somewhat hard to configure. My blog is served from a Linode virtual server, which can also fill this gap, but it is also resource-limited, less secure, and far from most of my data.

It has been fun to play with. I migrated some things from Linode to my home server and downsized the Linode. I set up a reverse proxy for my printer and scanner so that I can access the admin pages from outside of my house. I set up a Gitea server for some of my personal things. We will see what other useful purposes it can serve.


Leap Years

On Sunday, I was reading a page on Wikipedia, when this quote caught my attention:

The Gregorian leap cycle, which has 97 leap days spread across 400 years, contains a whole number of weeks (20871).

This is somewhat surprising to me, since I assume that this was by accident and not by design, which means there was only a 1-in-7 chance of it happening.
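The quoted numbers are easy to check. Here is a minimal Python sketch that counts the days in one 400-year cycle using the standard library's Gregorian leap-year rule; picking 2000-2399 as the cycle is an arbitrary choice on my part.

```python
import calendar

# Any 400 consecutive years make up one full Gregorian cycle;
# 2000-2399 is an arbitrary choice.
years = range(2000, 2400)
leap_days = sum(calendar.isleap(y) for y in years)
total_days = 365 * len(years) + leap_days
weeks, leftover_days = divmod(total_days, 7)

print(leap_days, total_days, weeks, leftover_days)
# prints: 97 146097 20871 0
```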

One result of this is that the calendar for 2000 is the same as the calendar for 2400, which makes a perpetual calendar such as this one quite a bit easier to specify.

Another consequence is that not every calendar is equally likely, which I captured in this silly Mastodon post:

Did you know that October 8 is 3.5% less likely to fall on a Sunday than a Saturday? Enjoy your rare day!

I have few followers, but I was pretty sure my old internet pal Dr. Drang would take the bait (he did).

If I were more patient, I could have waited until next February 29, which falls on a Thursday. Since leap days are more rare than non-leap days, the disparity is greater, and my 3.5% could have been 14%. Wednesday is the more common day for February 29, which again reminds us that we are currently on one of the less common paths through the perpetual calendar.
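For anyone who wants to reproduce these weekday counts, here is a rough Python sketch that tallies the weekday of a given date over one 400-year cycle. The helper name `weekday_counts` and the 2000-2399 range are my own choices for illustration, and it relies on Python's `datetime`, which uses the Gregorian rules throughout.

```python
import datetime
from collections import Counter

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def weekday_counts(month, day, first_year=2000, last_year=2399):
    """Count how often month/day lands on each weekday in a 400-year cycle."""
    counts = Counter()
    for year in range(first_year, last_year + 1):
        try:
            date = datetime.date(year, month, day)
        except ValueError:
            # February 29 does not exist in non-leap years; skip those.
            continue
        counts[DAY_NAMES[date.weekday()]] += 1
    return counts

oct8 = weekday_counts(10, 8)
feb29 = weekday_counts(2, 29)
print("Oct 8: ", dict(oct8))
print("Feb 29:", dict(feb29))
```

Comparing `oct8["Sun"]` with `oct8["Sat"]`, or `feb29["Thu"]` with `feb29["Wed"]`, gives the disparities mentioned above.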

But calendars are fun to think about, so I didn’t stop there.

A year is about 365.24219 days long. The Julian calendar has a leap year every 4 years, for an average year length of 365.25 days. The error, of course, adds up relatively quickly, as we eventually noticed.

The Gregorian calendar skips leap years on years divisible by 100, unless they are also divisible by 400, leaving the 97 leap years per 400 years quoted above. This makes a year on average 365 + 97/400 = 365.2425 days long, which is closer! It takes 3,225 years before you drift a day, which I guess is good enough?

At this point, I stumbled upon the Revised Julian calendar, which skips leap years on years divisible by 100, unless they are also either 200 or 600 mod 900. This makes a year 365 + 218/900 = 365.24222 days long, which is even better. Now it takes over 30,000 years before you drift a day. The rule is much more confusing, although it has the benefit (by design) that it matches the Gregorian calendar exactly from March 1600 through February 2800. This lets you claim you are following a more accurate calendar without really making a fuss. Also, 900 years of the Revised Julian calendar is not a whole number of weeks, so the Revised Julian perpetual calendar would actually have a 6,300-year cycle.

Finally, I spent some time thinking about what I would have done to handle the leap year problem if I ran the world. The answer is so obvious that it makes you doubt the wisdom of our ancestors. The Julian calendar drifts by 1 day every 128 years. We could have had a calendar (could I call it the Griggorian?) that had leap years every year divisible by 4 except those also divisible by 128. This gets you a year with an average length of 365 + 31/128 = 365.2421875 days, which would mean one day of drift every 400,000 years.
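The drift figures for all four calendars come from the same small calculation, sketched below in Python. It takes 365.24219 days as the true year length, as quoted above, so the results are only as good as that approximation; the function name is made up for this example.

```python
TROPICAL_YEAR = 365.24219  # approximate year length used in the text

# (name, leap years per cycle, cycle length in years)
CALENDARS = [
    ("Julian", 1, 4),
    ("Gregorian", 97, 400),
    ("Revised Julian", 218, 900),
    ("Every 4 except 128", 31, 128),
]

def years_per_day_of_drift(leaps, cycle):
    """Years until the calendar is off by one full day."""
    mean_year = 365 + leaps / cycle
    return 1 / abs(mean_year - TROPICAL_YEAR)

for name, leaps, cycle in CALENDARS:
    mean_year = 365 + leaps / cycle
    drift = years_per_day_of_drift(leaps, cycle)
    print(f"{name:20s} {mean_year:.7f} days/year, one day of drift in ~{drift:,.0f} years")
```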

I admit that computing divisibility by 128 is harder (for a human) than 400, but otherwise the rule is clearly simpler. The other downside is that 128 years of this calendar is not a whole number of weeks, which means that the perpetual calendar would have an 896-year cycle. But as long as I’m in charge, we might as well solve that by starting every year on a Sunday.
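The three leap-year rules and their perpetual-calendar cycle lengths can be written out as a short Python sketch. The function names are hypothetical, and the cycle helper leans on the fact that 7 is prime: if one rule cycle is not a whole number of weeks, seven of them are.

```python
def is_leap_gregorian(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def is_leap_revised_julian(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 900 in (200, 600))

def is_leap_128(year):
    # The hypothetical rule from the text: every 4th year, except multiples of 128.
    return year % 4 == 0 and year % 128 != 0

def perpetual_cycle_years(is_leap, cycle):
    """Smallest multiple of `cycle` years that is a whole number of weeks."""
    days = sum(366 if is_leap(y) else 365 for y in range(1, cycle + 1))
    return cycle if days % 7 == 0 else cycle * 7

print(perpetual_cycle_years(is_leap_gregorian, 400))
print(perpetual_cycle_years(is_leap_revised_julian, 900))
print(perpetual_cycle_years(is_leap_128, 128))
```

The same functions can confirm that the Gregorian and Revised Julian rules agree for every year from 1601 through 2799.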


Some electric analysis

I bought an electric car last month, which got me interested in my electric bill. I was surprised to find out that my electric company lets you export an hour-by-hour usage report for up to 13 months.

There are two choices of format: CSV and XML. I deal with a lot of CSV files at work, so I started there. The CSV file was workable, but not great. Here is a snippet:

"Data for period starting: 2022-01-01 00:00:00  for 24 hours"
"2022-01-01 00:00:00 to 2022-01-01 01:00:00","0.490",""
"2022-01-01 01:00:00 to 2022-01-01 02:00:00","0.700",""

The “Data for period” header was repeated at the beginning of every day. (March 13, which only had 23 hours due to Daylight Saving Time adjustments, also said “for 24 hours”.) There were some blank lines. It wouldn’t have been hard to delete the lines that didn’t correspond to an hourly meter reading, especially with BBEdit, Vim, or a spreadsheet program. But I was hoping to write something reusable in Python, preferably without regular expressions, so I decided it might be easier to take a crack at the XML.
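For comparison, here is roughly what the CSV route could look like without regular expressions. This is a sketch based only on the three-line snippet above: it assumes every real reading has three columns with a "start to end" range in the first one, and the function name `read_hourly_usage` is made up. The sample includes a blank line to mimic the ones in the export.

```python
import csv
import io
from datetime import datetime

SAMPLE = '''"Data for period starting: 2022-01-01 00:00:00  for 24 hours"
"2022-01-01 00:00:00 to 2022-01-01 01:00:00","0.490",""

"2022-01-01 01:00:00 to 2022-01-01 02:00:00","0.700",""
'''

def read_hourly_usage(lines):
    """Yield (start, usage) pairs, skipping headers and blank lines."""
    for row in csv.reader(lines):
        # Readings have three columns; the "Data for period" header
        # has one, and blank lines come back as empty rows.
        if len(row) != 3 or " to " not in row[0]:
            continue
        start_text, _end_text = row[0].split(" to ")
        start = datetime.strptime(start_text, "%Y-%m-%d %H:%M:%S")
        yield start, float(row[1])

readings = list(read_hourly_usage(io.StringIO(SAMPLE)))
for start, usage in readings:
    print(start, usage)
```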

Here is the general structure of the XML:

<entry>
  <content>
    <IntervalBlock>
      <IntervalReading>
        <timePeriod>
          <duration>3600</duration>
          <start>1641024000</start>
        </timePeriod>
        <value>700</value>
      </IntervalReading>
    </IntervalBlock>
  </content>
</entry>

Just like the CSV, there is an entry for each day, called an IntervalBlock. It has some metadata about the day that I’ve left out because it isn’t important. What I care about is the IntervalReading which has a start time, a duration, and a value. The start time is the unix timestamp of the beginning of the period, and the value is Watt-hours. Since each time period is an hour, you can also interpret the value as the average power draw in Watts over that period.

XML is not something I deal with a lot day to day, so I had to read some Python docs, but it turned out very easy to parse:

from xml.etree import ElementTree
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt

ns = {'atom': 'http://www.w3.org/2005/Atom', 'espi': 'http://naesb.org/espi'}
tree = ElementTree.parse('/Users/nathan/Downloads/SCE_Usage_8000647337_01-01-22_to_12-10-22.xml')
root = tree.getroot()
times = [datetime.fromtimestamp(int(x.text))
         for x in root.findall("./atom:entry/atom:content/espi:IntervalBlock/espi:IntervalReading/espi:timePeriod/espi:start", ns)]
values = [float(x.text)
          for x in root.findall("./atom:entry/atom:content/espi:IntervalBlock/espi:IntervalReading/espi:value", ns)]
ts = pd.Series(values, index=times)

The ns dictionary allows me to give aliases to the XML namespaces to save typing. The two findall commands extract all of the start tags and all of the value tags. I turn the timestamps into datetimes and the values into floats. Then I make them into a Pandas Series (which, since it has a datetime index, is in fact a time series).
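To try the namespace handling without downloading a real export, here is a self-contained sketch that parses a tiny inline document. The sample XML is invented to match the structure shown earlier, and the way it declares its namespaces is an assumption. It also walks one IntervalReading at a time, a variation that keeps each start paired with its value rather than relying on two parallel lists.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree

NS = {'atom': 'http://www.w3.org/2005/Atom', 'espi': 'http://naesb.org/espi'}

# Hypothetical two-reading document; a real export has far more entries.
SAMPLE = """\
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:espi="http://naesb.org/espi">
  <entry><content>
    <espi:IntervalBlock>
      <espi:IntervalReading>
        <espi:timePeriod><espi:duration>3600</espi:duration><espi:start>1641024000</espi:start></espi:timePeriod>
        <espi:value>700</espi:value>
      </espi:IntervalReading>
      <espi:IntervalReading>
        <espi:timePeriod><espi:duration>3600</espi:duration><espi:start>1641027600</espi:start></espi:timePeriod>
        <espi:value>650</espi:value>
      </espi:IntervalReading>
    </espi:IntervalBlock>
  </content></entry>
</feed>
"""

root = ElementTree.fromstring(SAMPLE)
readings = []
path = './atom:entry/atom:content/espi:IntervalBlock/espi:IntervalReading'
for reading in root.findall(path, NS):
    start = int(reading.find('espi:timePeriod/espi:start', NS).text)
    watt_hours = float(reading.find('espi:value', NS).text)
    # Use an explicit UTC timezone so the result does not depend on the machine.
    readings.append((datetime.fromtimestamp(start, tz=timezone.utc), watt_hours))

for when, watt_hours in readings:
    print(when.isoformat(), watt_hours)
```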

My electricity is cheaper outside of 4-9 p.m., so night time is the most convenient time to charge. I made a quick visualization of the last year by restricting myself from midnight to 4:00 a.m. and taking the average of each day. Then I plotted it without lines and with dots as markers:

plt.plot(ts[ts.index.hour<4].groupby(lambda x: x.date()).mean(), ls='None', marker='.')

As expected, you see moderate use in the winter from the heating (gas, but with an electric blower). Then a lull for the in-between times, a peak in the summer where there is sometimes a bit of AC running in the night, another lull as summer ends, and then a bit of an explosion when I started charging the car.

For now, I am actually using a 120 V plug which can only draw 1 to 1.5 kW and is a slow way to charge a car. Eventually I will get a 240 V circuit and charger, increase the charging speed 5x, and have even bigger spikes to draw.


Amazon accounts, Kindle, and families

I’m back from a blogging hiatus for a quick complaint about the sorry state of Amazon’s account system, especially when it comes to households and minors.

Everything that follows is to the best of my knowledge, and only includes the features I actually use.

A regular Amazon account can be used for shopping, Kindle, and Prime Video (among other things). You can have a maximum of two regular Amazon accounts in a household, and they can share Prime shipping benefits and Kindle purchases, but not Prime Video. However, under the primary member’s Prime Video login, you can have sub-profiles to separate household members.

On a Kindle device, you can share ebook purchases with minors using Amazon Kids. This is not a true Amazon or Kindle account, but a sub-account within a regular Amazon account. That is, you sign into the Kindle with the parent’s account and then enter Kid Mode. All purchases (or library check-outs) must be made on the parent’s account and then copied over to the child’s library using a special Amazon dashboard.

Note that Amazon Kids+ is a different product: it is basically Kindle Unlimited for Amazon Kids accounts. I have used it and I think the selection is terrible. For example, they love to carry the first book of a series but not the remainder of the series. Also, when I last used it, there was no way to know which books are available through Amazon Kids+ short of searching for the book on a kid’s device.

There is a shopping feature called Amazon Teen. This is essentially a regular Amazon account, but it is linked to a parent’s account, and purchases are charged to the parent’s card, with the option to require purchase-by-purchase approval from the parent. This is a way to share Prime shipping features with a teenager, and the only way to share Prime shipping with more than a single person in your household. Crucially, Amazon Teen accounts cannot purchase Kindle books, log into a Kindle device, or share Kindle purchases with the parent’s account.

Until now, I have mostly survived in the Amazon Kids world, despite the friction involved in getting a book onto a kid’s device. My kids have mostly adapted by ignoring their Kindles and reading books in Libby on their phones. This isn’t a good fit for my teen and tween, who need to read books at school. They are not allowed to use phones at school, but are allowed to use e-ink Kindles.

Everything came to a head this weekend, when I tried to make them both Amazon Teen accounts, which are useful in their own right. (The current practice is that they text me an Amazon link when they need something, and it will be nice for them to be a little bit more self-sufficient.) This was before I knew that Amazon Teen accounts couldn’t buy Kindle books (why?), so I then attempted to create them each a second account, not linked to mine in any way, for Kindle purposes.

That is when things came to a screeching halt, but this is at least partially my fault. While I had been looking into this, I was downloading Kindle books to my computer using a Keyboard Maestro script that simulated the five clicks required for each download. I’m pretty sure that this triggered some robot-defensive behavior from Amazon, which made it impossible for me to create an account without a cell phone attached to the account. But all of our household phone numbers are already attached to other accounts, and attempting to remove them put me into an infinite loop of asking for passwords and asking for OTPs.

I eventually solved this problem in two different ways. One involved talking to a human at Amazon’s tech support, which I admit is better than many of the other tech companies at solving this kind of problem. The other involved a VPN, which seems to have freed me from bot-suspicion.

But in the end, I also put in an order for a Kobo. I’m told they can sync directly with Libby for library checkouts, unlike Amazon which requires a complex multi-click dance which might prevent my kids from using their Kindles even if I do get their accounts squared away. And these are the last major micro-USB devices in the house, so maybe the time has come to move on. Ironically, the only way I could find a Kobo that shipped in less than a week was to buy it from Amazon.


The last couple restaurants I’ve visited used Toast for payments, and this is what QR codes were meant for. The receipt has a QR code specific to your order, on iOS it opens an App Clip, and you can pay with Apple Pay.

Great experience, especially compared to the old way. Waiting for the server to pick up and return your credit card is the worst.


Fastmail JMAP backup

[Update: Since I first wrote this, Fastmail switched from using HTTP BasicAuth to Bearer Authorization. I have updated the script to match.]

I use Fastmail for my personal email, and I like to keep a backup of my email on my personal computer. Why make a backup? When I am done reading or replying to an email, I make a split-second decision on whether to delete or archive it on Fastmail’s server. If it turns out I deleted something that I need later, I can always look in my backup. The backup also predates my use of Fastmail and serves as a service-independent store of my email.

My old method of backing up the email was to forward all my email to a Gmail account, then use POP to download the email with a hacked-together script. This had the added benefit that the Gmail account also served as a searchable backup.

Unfortunately the Gmail account ran out of storage and the POP script kept hanging for some reason, which together motivated me to get away from this convoluted backup strategy.

The replacement script uses JMAP to connect directly to Fastmail and download all messages. It is intended to run periodically, and what it does is pick an end time 24 hours in the past, download all email older than that, and then record the end time. The next time it runs, it searches for mail between the previous end time and a new end time, which is again 24 hours in the past.

Why pick a time in the past? Well, I’m not confident that if you search up until this exact moment, you are guaranteed to get every message. A message could come in, then two seconds later you send a query, but it hits a server that doesn’t know about your message yet. I’m sure an hour is more than enough leeway, but since this is a backup, we might as well make it a 24-hour delay.
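The window arithmetic is small enough to sketch on its own. The helper below is hypothetical (the name `backup_window` is mine), and its one-week fallback for a first run is an assumption borrowed from the default in the script that follows.

```python
import datetime

def backup_window(now, last_end_time=None, delay_hours=24):
    """Return the (start, end) search window for one backup run."""
    end = now.replace(microsecond=0) - datetime.timedelta(hours=delay_hours)
    # With no previous run recorded, fall back to the week before `end`.
    if last_end_time is None:
        start = end - datetime.timedelta(weeks=1)
    else:
        start = last_end_time
    return start, end

first_run = backup_window(datetime.datetime(2023, 1, 10, 12, 0, 0))
second_run = backup_window(
    datetime.datetime(2023, 1, 11, 12, 0, 0), last_end_time=first_run[1]
)
print(first_run)
print(second_run)
```

Each run starts exactly where the previous one ended, so the windows tile the timeline with no gaps as long as the end time is saved after every run.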

Note that I am querying all mail, regardless of which mailbox it is in, so even if I have put a message in the trash, my backup script will find it and download it.

JMAP is a modern JSON-based replacement for IMAP and much easier to use, such that the entire script is 140 lines, even with my not-exactly-terse use of Python.

Here is the script, with some notes below.

import argparse
import collections
import datetime
import os
import requests
import string
import sys
import yaml

Session = collections.namedtuple('Session', 'headers account_id api_url download_template')


def get_session(token):
    headers = {'Authorization': 'Bearer ' + token}
    r = requests.get('https://api.fastmail.com/.well-known/jmap', headers=headers)
    [account_id] = list(r.json()['accounts'])
    api_url = r.json()['apiUrl']
    download_template = r.json()['downloadUrl']
    return Session(headers, account_id, api_url, download_template)


Email = collections.namedtuple('Email', 'id blob_id date subject')


def query(session, start, end):
    json_request = {
        'using': ['urn:ietf:params:jmap:core', 'urn:ietf:params:jmap:mail'],
        'methodCalls': [
            [
                'Email/query',
                {
                    'accountId': session.account_id,
                    'sort': [{'property': 'receivedAt', 'isAscending': False}],
                    'filter': {
                        'after': start.isoformat() + 'Z',
                        'before': end.isoformat() + 'Z',
                    },
                    'limit': 50,
                },
                '0',
            ],
            [
                'Email/get',
                {
                    'accountId': session.account_id,
                    '#ids': {
                        'name': 'Email/query',
                        'path': '/ids/*',
                        'resultOf': '0',
                    },
                    'properties': ['blobId', 'receivedAt', 'subject'],
                },
                '1',
            ],
        ],
    }

    while True:
        full_response = requests.post(
            session.api_url, json=json_request, headers=session.headers
        ).json()

        if any(x[0].lower() == 'error' for x in full_response['methodResponses']):
            sys.exit(f'Error received from server: {full_response!r}')

        response = [x[1] for x in full_response['methodResponses']]

        if not response[0]['ids']:
            return

        for item in response[1]['list']:
            date = datetime.datetime.fromisoformat(item['receivedAt'].rstrip('Z'))
            yield Email(item['id'], item['blobId'], date, item['subject'])

        # Set anchor to get the next set of emails.
        query_request = json_request['methodCalls'][0][1]
        query_request['anchor'] = response[0]['ids'][-1]
        query_request['anchorOffset'] = 1


def email_filename(email):
    subject = (
            email.subject.translate(str.maketrans('', '', string.punctuation))[:50]
            if email.subject else '')
    date = email.date.strftime('%Y%m%d_%H%M%S')
    return f'{date}_{email.id}_{subject.strip()}.eml'


def download_email(session, email, folder):
    r = requests.get(
        session.download_template.format(
            accountId=session.account_id,
            blobId=email.blob_id,
            name='email',
            type='application/octet-stream',
        ),
        headers=session.headers,
    )

    with open(os.path.join(folder, email_filename(email)), 'wb') as fh:
        fh.write(r.content)


if __name__ == '__main__':
    # Parse args.
    parser = argparse.ArgumentParser(description='Backup jmap mail')
    parser.add_argument('--config', help='Path to config file', nargs=1)
    args = parser.parse_args()

    # Read config.
    with open(args.config[0], 'r') as fh:
        config = yaml.safe_load(fh)

    # Compute window.
    session = get_session(config['token'])
    delay_hours = config.get('delay_hours', 24)

    end_window = datetime.datetime.utcnow().replace(microsecond=0) - datetime.timedelta(
        hours=delay_hours
    )

    # On first run, 'last_end_time' won't exist; download the most recent week.
    start_window = config.get('last_end_time', end_window - datetime.timedelta(weeks=1))

    folder = config['folder']

    # Do backup.
    num_results = 0
    for email in query(session, start_window, end_window):
        # We want our search window to be exclusive of the right endpoint.
        # It should be this way in the server, according to the spec, but
        # Fastmail's query implementation is inclusive of both endpoints.
        if email.date == end_window:
            continue
        download_email(session, email, folder)
        num_results += 1
    print(f'Archived {num_results} emails')

    # Write config
    config['last_end_time'] = end_window
    with open(args.config[0], 'w') as fh:
        yaml.dump(config, fh)

The get_session function is run once at the beginning of the script, and fetches some important data from the server including the account ID and the URLs to use.

The query function does the bulk of the work, sending a single JSON request multiple times to page through the search results. It is actually a two-part request, first Email/query, which returns a list of ids, and then Email/get, which gets some email metadata for each result. I wrote this as a generator to make the main part of my script simpler. The paging is performed by capturing the ID of the final result of one query, and asking the next query to start at that position plus one (lines 77-78, where anchor and anchorOffset are set). We are done when the query returns no results (the return on line 69).
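The anchor-based paging can be tried out without a server. In the sketch below, `fake_email_query` is an invented in-memory stand-in for Email/query that only mimics the anchor and offset behavior described above; it is not Fastmail's implementation or a real JMAP client.

```python
def fake_email_query(all_ids, limit, anchor=None, anchor_offset=0):
    """Stand-in for the server: up to `limit` ids starting at the
    anchor's position plus the offset (or at the top with no anchor)."""
    position = 0 if anchor is None else all_ids.index(anchor) + anchor_offset
    return all_ids[position:position + limit]

def page_through(all_ids, limit=50):
    """Yield every id, one page at a time, generator-style."""
    anchor = None
    while True:
        offset = 0 if anchor is None else 1
        page = fake_email_query(all_ids, limit, anchor, offset)
        if not page:
            # An empty page means we have run off the end of the results.
            return
        yield from page
        # The last id on this page anchors the next request.
        anchor = page[-1]

ids = [f"M{n}" for n in range(120)]
print(len(list(page_through(ids))))
```

With 120 ids and a limit of 50, this makes four requests: pages of 50, 50, and 20, then an empty page that ends the loop.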

The download_email function uses the blob ID to fetch the entire email and saves it to disk. This doesn’t really need to be its own function, but it will help if I later decide to use multiple threads to do the downloading.

Finally, the main part of the script reads configuration from a YAML file, including the last end time. It loops through the results of query, calling download_email on each result. Finally, it writes the configuration data back out to the YAML file, including the updated last_end_time.

To run this, you will need to first populate a config file with the destination folder and your API token, like this:

token: ffmu-xxxxx-your-token-here
folder: /path/to/destination/folder

You will also need to install the ‘requests’ and ‘pyyaml’ packages using python -m pip install requests pyyaml. Copy the above script onto your computer and run it using python script.py --config=config_file. Note that everything here uses Python 3, so you may have to replace ‘python’ with ‘python3’ in these commands.


Productive couple of days for my rather neglected Linode instance. Upgraded the distro from Ubuntu 14.04 to Debian 10. Moved DNS from Amazon to Google. Moved various static sites from S3 to Linode. Somehow it all still works.


Reading feeds in a world of newsletters

I understand the popularity of email newsletters, especially for publishers. It’s a simple way to get paid content out, easier for users than a private RSS feed. But that doesn’t mean I want to read newsletters in my email app.

Feedbin, which I am already using for my regular RSS subscriptions, bridges the gap. As part of my Feedbin account, I get a secret email address, and anything sent to that address ends up in my RSS reader. Problem solved!

But it quickly gets annoying to sign up for newsletters (often creating an account) with an email address that is neither memorable nor truly mine. Fastmail, which I am already using for my regular email, makes it easy to find specified emails sent to my regular address, forward them to my Feedbin address, and put the original in the trash.

In fact, Fastmail lets me use “from a member of a given contact group” as the trigger for this automatic rule, which makes the setup for a new newsletter very simple:

  1. Subscribe to the newsletter
  2. Add the sender to my Fastmail address book
  3. Add the newly created contact to my “Feedbin” group

This is very convenient, for newsletters as well as other mail that is more of a notification than an email. Here are some of the emails that I now read as though they were feeds: