gclv

I write a lot. I like to write, a lot. I write for thinking, and often thinking without writing feels like trying to remember one song while another song is playing: a feat of raw mental strength.

I write in English, though I mostly don't think in English, because it has a more utilitarian feel to it. When I write in Portuguese, I often get lost in “that's a beautiful word”, or “that sentence flows so well”. Of course, that says more about my ability to produce written English than the aesthetics of the English language itself.

But I never publish what I write, because why would I? Once the thinking is done, it's done: I get to own the outcome of it for myself. It's unlikely I will want to read it in the future. But Listed's 100-day writing challenge has made me pause.

Don't worry, reader: I'm not posting things every day for 100 days. But could there be value in putting one's thoughts out there? Making things public certainly has a finality to it: once the word is out, it's out. So the idea of writing things in public has a boldness to it that appeals to me.

As an example, Belle Beth Cooper's posts are deeply personal, yet profoundly interesting to me. Here I am, reading about how much money this stranger from the other side of the world has spent on stationery, and throughout the post I'm saying to myself “well done, Belle!”. There are certainly others I follow with the same level of intensity.

So that's something I may pursue: writing less based on the persona I want to project online for whatever reason (that's too much effort to be sustainable), and thinking more in terms of just posting the things I write to myself. Less in the hopes that someone might read them and find them useful (that would be a long shot indeed!), and more because of what it can do to me as I write.

In no particular order.

  • Listening to podcasts. I'm very fond of podcasts. Having something interesting to listen to all the time has gotten me to walk more, do more dishes, and enjoy an otherwise bad commute (remember those?). Plus, like any new medium, it feels like a close community: there are so many creators I feel very close to. Many years in, however, the downsides are clear: constant craving for novelty and entertainment, mindlessness, reluctance to connect to people nearby (the few people who are not wearing headphones, anyway). I want to see what the other side is like. Maybe I'm missing out on something. Maybe not.
  • The cloud. This one has been a long time coming. I've been overly reliant on other people's computers. I don't care for video entertainment, so Netflix has never appealed to me, but getting DRM on my entire music collection was not a good move. I'm beginning to self-host most of the services I rely on, running from my Synology NAS and Raspberry Pis. I love my Kindle as a device, but the DRM aspect bugs me. I'm on the lookout for a good e-ink device with good support for open formats. I'm now the owner of an e-ink tablet with DRM-free epub support.
  • My phone. I notice I barely use it to talk to people anymore: only bots.
  • Closed-source software. Plenty has been written about this. This has the fortunate side-effect of ruling out most phone apps and SaaSS (service as a software substitute).
  • The web. It's beyond repair at this point. There have long been purists talking about “saving the open web”, but that ship has sailed. Web apps are bad many times over, since they use proprietary JavaScript, require internet connectivity, and remove control from the users. Gopher and Gemini seem promising, but I admit I'm very ignorant here.

After re-reading some of the papers from Bell Labs, something clicked in my mind, and I'm hooked. I'm now reading “The UNIX Programming Environment”, by Kernighan and Pike. It's got that fun style you're probably familiar with, if you've read K&R or the blue book. One of the first exercise questions, in the chapter on file systems:

(harder) How does the pwd command operate?

Seems like a fun one. My first guess is that it uses $PWD from the environment. Let's test that.

~ % PWD=/usr/local pwd
/home/gg

There's also a -L flag that seems to use $PWD. Maybe that one would do?

~ % PWD=/usr/local pwd -L
/home/gg

Not that either. Wait a second, what's pwd again?

~ % type pwd
pwd is a shell builtin

Hah, so I was calling the wrong one. So I replace pwd with /bin/pwd in my queries above, but the results are the same.

My next hypothesis is that it would somehow expand . to an absolute path. I'm not aware of a UNIX command that performs such an expansion, so I man -k some keywords. Nothing.

Maybe pwd(1)? It's not terribly descriptive (it's such a simple utility after all.) It doesn't explain the implementation at all, but links me to getcwd(3). Alright, let's just look at the source.

OpenBSD implementation

int
main(int argc, char *argv[])
{
    int ch, lFlag = 0;
    const char *p;

    /* pledge(), parse flags... */

    if (lFlag)
        p = getcwd_logical();
    else
        p = NULL;
    if (p == NULL)
        p = getcwd(NULL, 0);

    if (p == NULL)
        err(EXIT_FAILURE, NULL);

    puts(p);

    exit(EXIT_SUCCESS);
}

Unless -L is passed, it just calls getcwd. Let's see what that “logical” function does:

static char *
getcwd_logical(void)
{
    char *pwd, *p;
    struct stat s_pwd, s_dot;

    /* Check $PWD -- if it's right, it's fast. */
    pwd = getenv("PWD");
    if (pwd == NULL)
        return NULL;
    /* Debug output; only safe once pwd is known to be non-NULL. */
    puts("PWD found in the ENV");
    puts(pwd);
    if (pwd[0] != '/')
        return NULL;

    /* check for . or .. components, including trailing ones */
    for (p = pwd; *p != '\0'; p++)
        if (p[0] == '/' && p[1] == '.') {
            if (p[2] == '.')
                p++;
            if (p[2] == '\0' || p[2] == '/')
                return NULL;
        }

    if (stat(pwd, &s_pwd) == -1 || stat(".", &s_dot) == -1)
        return NULL;
    if (s_pwd.st_dev != s_dot.st_dev || s_pwd.st_ino != s_dot.st_ino)
        return NULL;
    return pwd;
}

So -L does check $PWD, but only returns it if it's an absolute path with no . or .. components that points to the same inode, on the same device, as the current directory. You can't just manually override it to be anything you want: if any check fails, it falls back to the libc call to getcwd.

Makes me wonder what use this -L flag is in the first place. Maybe it has to do with symlinks?

/tmp % mkdir one
/tmp % ln -s one two
/tmp % cd one
/tmp/one % /bin/pwd
/tmp/one
/tmp/one % cd ../two
/tmp/two % /bin/pwd
/tmp/one
/tmp/two % /bin/pwd -L
/tmp/two

Makes sense. Anyway, that's not a very satisfying answer. I doubt the authors' intended answer would have been “defer to the libc”.

Plan9

Ok, OpenBSD source didn't help. But Plan9 is Unicibus ipsis Unicior, so maybe we can find the answer there. Let's inspect pwd(1):

DESCRIPTION Pwd prints the path name of the working (current) directory. Pwd is guaranteed to return the same path that was used to enter the directory. If, however, the name space has changed, or directory names have been changed, this path name may no longer be valid. (See fd2path(2) for a description of pwd's mechanism.)

Hah, that was helpful! Now, from fd2path(2):

As an example, getwd(2) is implemented by opening . and executing fd2path on the resulting file descriptor.

By the way, it turns out that fd2path(2) is a fascinating topic in its own right (cf. “Lexical File Names in Plan 9 or: 'Getting Dot-Dot Right'”).

So my hypothesis above was correct, at least when it comes to Plan9. Also, another cool thing about Plan9 is that it lets me inspect a folder (“everything is a file”, right?)

% cat . > foo
% cat foo

I can then run foo through hexdump and see what's in there.

GNU

Let's see how coreutils implements it… nope. Just nope.

Wrapping up

So that was it, a brief excursion into different implementations of a simple command in UNIX. The difference in complexity is palpable. The Plan9 documentation is fun to read, and so is the code.