The making of WebInfoKeeper

Published on November 5, 2024

1. Introduction and motivation

There are several pieces of information that I consistently want access to when browsing.

For instance, most job applications ask for my LinkedIn URL. My current workflow for getting that URL is to a) open linkedin.com, b) go to my profile, c) press ctrl+l to focus the address bar, and d) press ctrl+c to copy the address. It's a bit unwieldy, and I'd prefer a way to easily access that information.

Another example is remembering good prompts for LLMs. With certain types of coding questions, I find it is helpful to ask Claude to not provide a working solution until I tell it to. There are a couple of phrases I like to use, but why should I have to type them out each time?¹ It would be nicer if I had a quick way of storing and then accessing that information.

One solution would be to use a document of helpful information. If I had a functioning note-taking system, I could even use that! This could work, but it still poses a good deal of friction. Every time I have a piece of information I'd like to store, I would have to remember my paste sheet, open the document, and then paste the result. I'd probably also need to add some context so that I could remember what this information is for.

Given that I primarily use my computer through the Internet, I figured that I could make a good browser extension for this.² And so came WebInfoKeeper! [I will eventually link to a less detailed project description].

In this post, I will break down my design and implementation process for the extension, as well as what I learned.

2. High-level design details

I came to this idea right before my bedtime, so I furiously scrawled some notes for the basic design. The basic idea is to make a wrapper for the browser's localStorage API. The user can associate a value with a key, and later access the value through the key. (To complete the CRUD design, there is also a way to update and delete values.)
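As a rough sketch of what such a wrapper could look like (not the extension's actual code), the class below wraps any object that follows the Web Storage interface. The names InfoKeeper and MemoryStorage, the "wik:" prefix, and the example URL are all assumptions made for illustration; MemoryStorage only exists so the sketch runs outside a browser.

```javascript
// Hypothetical wrapper around a Web Storage-like backend
// (an object with length, key(), getItem(), setItem(), removeItem()).
class InfoKeeper {
  constructor(storage, prefix = "wik:") {
    this.storage = storage;
    this.prefix = prefix; // assumed namespace so other entries are left alone
  }

  // Create or update: associate a value with a key.
  set(key, value) {
    this.storage.setItem(this.prefix + key, String(value));
  }

  // Read: return the stored value, or null if the key is unknown.
  get(key) {
    return this.storage.getItem(this.prefix + key);
  }

  // Delete: forget a key and its value.
  remove(key) {
    this.storage.removeItem(this.prefix + key);
  }

  // List every stored key (without the prefix), sorted alphabetically.
  keys() {
    const found = [];
    for (let i = 0; i < this.storage.length; i++) {
      const name = this.storage.key(i);
      if (name !== null && name.startsWith(this.prefix)) {
        found.push(name.slice(this.prefix.length));
      }
    }
    return found.sort();
  }
}

// In-memory stand-in for window.localStorage (illustration only).
class MemoryStorage {
  constructor() { this.map = new Map(); }
  get length() { return this.map.size; }
  key(index) { return Array.from(this.map.keys())[index] ?? null; }
  getItem(name) { return this.map.has(name) ? this.map.get(name) : null; }
  setItem(name, value) { this.map.set(name, String(value)); }
  removeItem(name) { this.map.delete(name); }
}

// Example usage with a placeholder URL.
const keeper = new InfoKeeper(new MemoryStorage());
keeper.set("linkedin", "https://example.com/my-profile");
```

In a browser, passing window.localStorage instead of MemoryStorage should work the same way, since both expose the same five members.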

Throughout this, I set a guiding principle:

"Access should be easy and creation even easier."

The rationale: the purpose of the extension is to reduce the friction of accessing stored information, but storing information has more default friction. When I am trying to access a piece of information, I have a strong motivation to find it. However, in order to efficiently store information, I need to remember that storage is a possibility. In the middle of a task, that should be as minimal a disruption as possible.

To make my access easier, my first priority was to include keyboard shortcuts (which are already part of my default workflow). I'd add other modalities later, but the ultimate goal is to minimize friction.

3. Version 1.0

Given the simplicity of the design, and a desire to build, my first goal was to get a working version in a day.

Overall, the process went quite smoothly. The primary decision I had to make was how I was going to create the input form. Without too much thought, I decided to go with a modal dialog box. I wanted to duplicate some of the design decisions of Apple's Spotlight system for switching between tabs. I also thought it would give me maximal flexibility to style the system.

Writing the wrapper proved quite simple; most of the storage is abstracted away by the browser APIs.

This is what the popup initially looked like:

Version1.0_Screenshot.png

After pressing alt+c, any selected text was included as the "copied value," and you were prompted to include a key.

Getting a value was similarly simple. One pressed alt+g (as alt+v is taken), and got a dialog. Unfortunately, you initially had to remember the exact key, but if you did, the corresponding value was added to your clipboard.³

Version1.0_Screenshot_get.png

4. Version 1.1: Fixing some janky elements

While I was proud that the extension was now in working order, it wasn't particularly usable yet. There were two big issues:

You had to remember your key exactly and type it in without any help
On pages where keyboard inputs affect the page (like any form), adding text into the popup still affected the underlying page.

Remembering your key exactly:

I needed some way to get, sort, and display the keys. Luckily, this was quite straightforward to do. To get the key value, I had been using an <input> html element; all I needed to do was include a list=keystrs attribute, and create a datalist element with the id keystrs.

I ended up modifying my background script to include a method which passed all of the keys to the content script as an array. I then iterated through that array, and added datalist items:

// Add one <option> per stored key so the browser can suggest matches.
keys.forEach(function(key) {
    const option = document.createElement('option');
    option.value = key;
    keyStrs.appendChild(option);
});

The end result was that a list of possible keys floated under your input!

Version1.1_Screenshot.png

Preventing keyboard inputs from affecting the rest of the webpage

I was initially concerned that this was going to be quite difficult. It turned out to be an easy switch. I included an event listener on my dialog element for keystrokes; all I needed to do was prevent keyboard events from propagating up the document tree!
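A minimal sketch of that idea, assuming the dialog is an ordinary DOM element. The function name and the list of event types are illustrative assumptions, not taken from the extension's source.

```javascript
// Keyboard event types to contain (an assumed list).
const KEY_EVENTS = ["keydown", "keypress", "keyup"];

// Stop keyboard events that originate inside `dialog` from bubbling
// up to listeners registered on its ancestors.
function containKeyboardEvents(dialog) {
  for (const type of KEY_EVENTS) {
    dialog.addEventListener(type, (event) => {
      event.stopPropagation();
    });
  }
}
```

One caveat: stopPropagation() only halts the event from this point onward. A page that registers its handlers in the capture phase on an ancestor would still see the keystroke before the dialog does.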

Other changes

I added the ability to access these functions using the context menu:

Version1.1_Screenshot_context.png

I tried for a while to get my two commands to show up on the parent menu, but I don't think it's possible. That works against my goal of reducing the friction as much as possible, but given that the keyboard shortcut is the main show, I didn't worry about it too much.

I also switched from Manifest V2 to Manifest V3, in preparation for eventually adding Chrome support.

Version 1.1 was definitely in a much better state than Version 1.0. There were a couple of things that still bothered me though, which ended up prompting a major refactor of my code.

While injecting custom html into webpages gave me a lot of flexibility, it had a couple of downsides. First, there are some pages where it is impossible to inject html. The most important is the about:newtab page. That meant that my script just didn't work when I was in a new tab. Second, after I had finished getting a value, keyboard focus was not on any element. This added a little bit of friction when trying to paste a value. Third, when I was in the navigation bar and triggered the shortcut, focus did not move to the dialog box (because it was already outside of the web page).

All three of these downsides are rather minor, but all three add friction to storing and accessing information, especially for keyboard users. I needed a different way to query information.

Deciding where to refactor

As attached as I was to the modal dialog box, it wasn't a tractable option. Initially, I was split between using query, a browser method which displays a little popup to get information from a user, and creating a new window to overlay over the current webpage. Ultimately, however, query didn't have enough customization and creating a new window felt like overkill. After discussing the issue with Claude Sonnet, I ended up settling on a third solution: using the built-in browser popup.

I rewrote the extension so that when I pressed alt+c or alt+g, it opened the browser popup and then queried me for the requisite information. This required learning quite a bit more about html and CSS.

The first goal was to try to replicate the popup look. I looked through documentation, but there does not appear to be a way to make an extension popup act as a modal dialog. (In fact, there was a proposal to add this capability to Chrome, but it was rejected.) My second attempt was to make the background of the extension transparent, and then place the text box at the center of the screen, simulating a modal dialog. While this is possible in Chrome, it is not in Firefox, so I eventually gave up trying to simulate the popup experience.

Once I accepted that the popup would look different, I needed to set up the html. This required some refactoring: because I was not injecting html into an existing page, I had to set up the entire layout. The browser popup API gives a single html document, but I needed to represent at least three distinct pages. Given the small scope of the project, I decided to use <div> elements which I would make visible and invisible at will. The resulting structure looked like:

<div id="input-mode" style="display:none;">
    Things for setting a value
</div>
<div id="output-mode" style="display:none;">
    Things for getting a value
</div>
<div id="default-mode">
    Home page for the extension
</div>

Using JavaScript, I change the display property of each <div> as necessary. The remainder of the display code transferred line for line from the existing modal dialog.⁴ Later on, I plan to add pages for updating values and for settings, but I have not implemented those yet. This proved a helpful exercise in using html.
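The switching logic can be sketched roughly as follows. The three ids come from the markup above; the function name showMode and the choice of "block" as the visible display value are assumptions for illustration.

```javascript
// Ids of the three <div> "pages" in the popup document.
const MODE_IDS = ["input-mode", "output-mode", "default-mode"];

// Show the requested mode and hide the other two.
// `doc` is anything with getElementById, normally `document`.
function showMode(doc, modeId) {
  if (!MODE_IDS.includes(modeId)) {
    throw new Error(`Unknown mode: ${modeId}`);
  }
  for (const id of MODE_IDS) {
    const element = doc.getElementById(id);
    if (element) {
      element.style.display = id === modeId ? "block" : "none";
    }
  }
}
```

In the popup script this would be called as showMode(document, "input-mode") when the store shortcut fires, and showMode(document, "default-mode") to return to the home page.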

The current UI:

Version1.2_Screenshot.png

Dealing with the clipboard

The popup works much closer to the low-friction spec. There's only one tradeoff: if I have selected text, there is no way for me to combine copying the text and adding the key. Previously, alt+c copied the text and added it to the "Value" textbox simultaneously. However, as I no longer have access to the underlying webpage, the best I can do is to take the current item on the clipboard and display it as a suggestion for the "value."⁵

There was one subtlety. For security reasons, an extension cannot programmatically read from the browser clipboard OR open a popup except as the result of direct user action. I am not quite sure exactly how this works, but I had to play around with my code structure in order to let the extension both copy from the clipboard and open the popup simultaneously.

Refactoring communication between popup.js and background.js

The earlier versions used browser.tabs.sendMessage() in order to instruct a content script to display a popup. The content script would then return a promise which eventually resolved with the user information (the key and value strings, once entered).

This approach worked well when the only way to trigger a query was through the background script. However, with the addition of a popup, I decided to add the ability to input new key value pairs through the popup itself:

Version1.2_Screenshot_Popup.png

I switched to a "fire and forget" model. Any time the popup completed an action, it would send a message to background.js with the relevant information. In return, background.js could send messages to popup.js without worrying about receiving the response in the same code block. This ended up making the code much more modular, and the code now deals with the three separate modalities (keyboard shortcut, context menu, and popup) more smoothly!
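A sketch of what a fire-and-forget handler might look like on the background side. The message shape ({ action, key, value }), the action names, and the use of a Map as the store are assumptions for illustration; only browser.runtime.onMessage and browser.runtime.sendMessage are real WebExtension APIs.

```javascript
// Handle one message from the popup. The result object is for the
// background script's own use (logging, follow-up messages); it is
// not sent back to the sender.
function handleMessage(message, store) {
  switch (message.action) {
    case "set":
      store.set(message.key, message.value);
      return { ok: true };
    case "get":
      return { ok: store.has(message.key), value: store.get(message.key) };
    case "remove":
      return { ok: store.delete(message.key) };
    default:
      return { ok: false, error: `Unknown action: ${message.action}` };
  }
}

// Wiring inside the extension (skipped when run outside a browser).
if (typeof browser !== "undefined" && browser.runtime) {
  const store = new Map();
  browser.runtime.onMessage.addListener((message) => {
    // No return value: the sender does not wait for a reply.
    handleMessage(message, store);
  });
  // In popup.js, the sending side would be a single call:
  // browser.runtime.sendMessage({ action: "set", key: "k", value: "v" });
}
```

Keeping the dispatch logic in a plain function, separate from the listener registration, means the same handler can serve messages triggered by the shortcut, the context menu, or the popup.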

Other changes:

These were the main changes. I have worked a little bit on styling the popup window, but have not settled on a final design yet.

5. What's next?

Version 1.2 is almost ready to be distributed to Firefox. I need to implement the List Keys feature and add a small settings page, but it is mostly in working order.

I have learned a lot from this project, but I mostly want a working version now, so I don't see making a large number of improvements in the future. However, there are at least three things I would like to address:

I would like to confirm that I am handling user inputs safely. I am displaying user inputs (from the clipboard), which might make me vulnerable to cross-site scripting (XSS) attacks. I am doing some sanitization, but would like to learn more about cleaning inputs.
The extension is already written in Manifest V3; I would like to port a version to Chrome.
I need to write a better description for the extension, as well as create a small demo. Multiple people have expressed interest in the extension; I want to make sure it is easy to use and download!
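On the first point, two common precautions can be sketched as below. Both function names are hypothetical, and this is a general illustration rather than a description of the sanitization the extension currently does.

```javascript
// Escape the five characters that are significant in HTML text
// and attribute values, for cases where a string must be built as markup.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Usually simpler: assigning to textContent never parses the string
// as HTML, so clipboard text is shown literally.
function showValue(element, text) {
  element.textContent = String(text);
}
```

Where the clipboard text only needs to be displayed or placed in a text box, the textContent (or value) route avoids building HTML strings at all.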

For reference, here is the prompt I like to use: "Do not provide a complete working code sample, but feel free to add snippets to help illustrate your point." Yes, it's two sentences. I'm being lazy. But this extension makes it easier to iterate on that prompt. ↩
I have also treated this as good practice for the ScreenLike project, which I have described elsewhere. Being an easier problem, it let me isolate aspects of extension design which are harder to approach in the screen time setting. ↩
The weird spacing of the dialog here is due to a misunderstanding I had between visibility: hidden and display: none. In the above example, I set the visibility of the value textarea to be hidden, but this meant it stayed in the layout. ↩
I have also been practicing my CSS skills on the resulting layout, but I don't have anything interesting to say about that. It has taught me how difficult it is to write good ids and class names for variables! ↩
If I were to restart this project, I might write it as a small desktop application (or through AutoHotKey) in order to get better access to the "copy" and "paste" actions. ↩