Link migration nonsense

Jul 5, 2025, 8:34 AM
Δ
Jul 6, 2025, 7:52 AM
Julia SiteV2

Instead of doing something productive with my life, I wrote a couple functions to take my existing mess of “link” Markdown files and translate them into a different mess of “link” Markdown files.

The initial collection was the result of taking a few hundred open Safari tabs and sharing each one to Obsidian, which got me a ton of notes with idiosyncratic filenames and no metadata. The new one is a lot more structured: each note gets a unique numeric filename, a title (derived from the previous filename, if possible), and a url field for the link itself. Any existing metadata is passed along, and any contents that came after the hyperlink are preserved.

This is all recorded in Markdown metadata, but in the back of my head I’ve been trying to think about it as entity-relationship/graph/relational stuff, where Markdown metadata is just a handy way of recording more complex relationships (e.g. if there are tags, the note doesn’t own those tags, it’s just a convenient place to record those relationships). Maybe one day I’ll make more progress on that train of thought.

Anyways. Final Julia code for translating:

# Mimicking repl setup; should switch away from `using` sometime.
using CommonMark
using YAML
using Glob
using Dates
using TimeZones

function custom_format!(ast::CommonMark.Node, extra=Dict())
    keeporiginal = (a, b) -> a  # Little helper for avoiding merge overwrites

    firstchild = ast.first_child

    if firstchild.t isa CommonMark.FrontMatter
        fmatter = firstchild
    else
        fmatter = CommonMark.Node(CommonMark.FrontMatter("---"))
        CommonMark.prepend_child(ast, fmatter)
    end

    # Grab file name stub
    stub = match(r"([^/]+)\.md$", ast.meta["source"])[1]

    # Grab URL from content (making wild assumptions
    # that it'll be the first thing after the frontmatter)
    try
        url_node = fmatter.nxt
        url = join(child.literal for (child, _) in url_node)
        mergewith!(keeporiginal, fmatter.t.data, Dict("url" => url))
        
        # If we got the URL node, we can also cut it from the
        # body so it's not duplicated info.
        CommonMark.unlink(url_node)

    catch e
        # Report which note failed and why, then carry on without a URL
        println("Error getting URL for: $stub ($e)")
    end

    # Add title frontmatter data
    mergewith!(keeporiginal, fmatter.t.data, Dict("title" => stub))

    # Merge extras
    mergewith!(keeporiginal, fmatter.t.data, extra)

    # Update frontmatter literal (so it will be written correctly)
    fmatter.literal = YAML.write(fmatter.t.data)

    return ast
end


function convert_directory!(dir::AbstractString, starttime::AbstractString)
           
    parser = Parser()
    enable!(parser, FrontMatterRule(yaml=YAML.load))

    startstamp = Dates.DateTime(starttime, "yyyymmddHHMM")
    startstamp = ZonedDateTime(startstamp, tz"America/Phoenix")

    for (i, pth) in enumerate(glob("*.md", dir))
        # Get a timestamp
        tstamp = startstamp + Dates.Minute(i)

        # Load & modify
        ast = open(parser, pth)
        ast = custom_format!(ast, Dict("created" => tstamp))

        # Rewrite to new file
        filename = "$(Dates.format(tstamp, "yyyymmddHHMM")).md"
        open(filename, "w") do file
            markdown(file, ast)
        end
    end
end
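
A regex isn’t the only way to get the file name stub. As a minimal sketch (the `filestub` name and the example paths are hypothetical, not from the code above), Base’s path helpers can do the same job:

```julia
# Hypothetical helper: the file name without its directories or final extension.
# basename drops the directory part; splitext splits off the last ".ext".
filestub(path::AbstractString) = first(splitext(basename(path)))

# Made-up paths, just for illustration.
println(filestub("links/Some saved page.md"))
println(filestub("links/v1.2 release notes.md"))
```

Only the final extension is removed, so dots elsewhere in the name survive.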

Maybe I should write up more about how this could’ve been better. It’s pretty ugly!

Going forwards, I want a nicer way to dump new links into this “improved” format, since setting up all the fields is a bit finicky right now.
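
A rough sketch of what such a helper might look like, using only the Dates standard library. Everything here is an assumption for illustration: the `yaml_quote`, `link_note`, and `write_link_note` names, the hand-rolled YAML quoting, and the plain `DateTime` timestamp (the migration code above uses a `ZonedDateTime` and `YAML.write` instead).

```julia
using Dates

# Hypothetical helpers; none of these names come from the migration code.

# Wrap a string in double quotes (escaping backslashes and quotes)
# so that colons and similar characters stay valid YAML.
yaml_quote(s::AbstractString) =
    "\"" * replace(replace(s, "\\" => "\\\\"), "\"" => "\\\"") * "\""

# Build the filename and the full text of one link note.
function link_note(url::AbstractString, title::AbstractString;
                   created::DateTime=now(), body::AbstractString="")
    # Same minute-resolution numeric filename scheme as the migration.
    stamp = Dates.format(created, "yyyymmddHHMM")
    lines = ["---",
             "title: " * yaml_quote(title),
             "url: " * yaml_quote(url),
             "created: " * string(created),
             "---"]
    text = join(lines, "\n") * "\n"
    # Anything extra goes after the front matter, separated by a blank line.
    if !isempty(body)
        text *= "\n" * body * "\n"
    end
    return "$stamp.md", text
end

# Write the note into `dir`, refusing to clobber an existing file.
function write_link_note(dir::AbstractString, url::AbstractString,
                         title::AbstractString; kwargs...)
    filename, text = link_note(url, title; kwargs...)
    path = joinpath(dir, filename)
    isfile(path) && error("refusing to overwrite $path")
    write(path, text)
    return path
end
```

Calling `write_link_note("links", "https://example.com", "Example")` would create a file named after the current minute. Two links saved in the same minute would collide, and the `isfile` check turns that into an error rather than an overwrite.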