This repository uses Git LFS. Please make sure it is installed before cloning this repository.
Fully offline development
Should you want to work on this project without a reliable internet connection, and in particular check out branches or commits other than the current one, you will need to pre-fetch the LFS objects by running git lfs fetch <branchname>. If you want a local copy of all LFS data, use git lfs fetch --all instead.
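When several branches are needed offline, the fetch can be wrapped in a small loop. The following is a minimal sketch, not part of the project's tooling: the function name prefetch_lfs and the DRY_RUN switch are assumptions made for illustration, and it passes the remote explicitly, using the git lfs fetch <remote> <ref> form.

```shell
# Fetch LFS objects for each listed branch from the given remote.
# With DRY_RUN set to a non-empty value, the commands are only printed.
prefetch_lfs() {
  remote=$1
  shift
  for branch in "$@"; do
    ${DRY_RUN:+echo} git lfs fetch "$remote" "$branch" || return 1
  done
}
```

Calling DRY_RUN=1 prefetch_lfs origin dev stable prints the two fetch commands without running them; the branch names here are placeholders.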
Rebasing your fork
Should your rebase fail with strange errors, please run git lfs fetch upstream --all && git lfs fetch --all && git lfs push origin --all (assuming your remote is called origin and the main repo remote is called upstream) before the rebase command to get the LFS objects in sync.
Fixing up a preexisting cloned repo
If you cloned the iceshrimp repository before the LFS migration, make sure git-lfs is installed and that you have a backup of the repository in case something goes wrong. Then use the following commands to get it back in sync:
git fetch --all
git rebase
git lfs pull
git pull --prune
git tag | xargs git tag -d
git fetch --tags
If you've deleted all branches that still reference the old tree, you can run git reflog expire --expire=now --all && git gc --prune=now --aggressive to massively clean up disk space.
Should you have more remote-tracking branches, run git switch <branchname> && git rebase for them as well. If you have local-only branches, run git switch <branchname> && git rebase origin/dev for each of them (replacing origin with the name of the remote).
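To find out which branches are local-only, one option is to list every local branch together with its upstream and keep the ones that have none. This is a minimal sketch under that assumption; the helper name filter_local_only is hypothetical.

```shell
# Reads lines of the form "<branch> <upstream>" on stdin and prints the
# branches whose upstream field is empty, i.e. local-only branches.
# Branch names cannot contain spaces, so counting fields is sufficient.
filter_local_only() {
  awk 'NF == 1 { print $1 }'
}

# Intended input (not run here):
#   git for-each-ref --format='%(refname:short) %(upstream:short)' refs/heads
```

Each branch it prints is a candidate for the git switch <branchname> && git rebase origin/dev step described above.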
Migrating a fork to Git LFS
First, create a new fork in the Forgejo Web UI. After making sure you have git-lfs installed, clone the new fork into a new local directory, leaving the old repository untouched. Then, execute the following commands for every branch you want to copy:
cd /path/to/old/clone
git switch branchname
git format-patch <last shared commit hash> --stdout > /path/to/tmp/folder/branchname.patch
cd /path/to/new/clone
git branch -c branchname
git switch branchname
git am /path/to/tmp/folder/branchname.patch
Then resolve any merge conflicts like you normally would. Once you're done, push your changes & delete the old repository in the Forgejo Web UI.
Copying this repository to a different server
When copying this repository and not using the "fork" function, use the following steps to ensure the LFS files are transferred as well.
- Before doing anything else, make sure you have a full copy of the LFS data by running git lfs fetch --all.
- Now, add your new remote or change the URL of the existing one, e.g. with git remote set-url origin git@iceshrimp.dev:myusername/new-repo.git
- Next, run git push origin --all (swap origin for the name of your new remote if necessary).
- Finally, copy the LFS files by running git lfs push origin --all, again swapping origin for the name of your new remote.
And that's it! You should have a complete copy of this repository in your new location. To verify everything worked, check if the logo is displayed correctly in the README.
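If the logo (or any other LFS-tracked file) does not display, one likely cause is that the file on disk is still the small text pointer rather than the real content. The following is a minimal sketch for telling the two apart, assuming the standard pointer format whose first line begins with "version https://git-lfs.github.com/spec/"; the function name is_lfs_pointer is hypothetical.

```shell
# Succeeds if the given file looks like a Git LFS pointer file,
# fails for real content or for a missing file.
is_lfs_pointer() {
  # Only the first bytes matter; NUL bytes are dropped so binary files
  # do not trigger shell warnings.
  first_bytes=$(head -c 64 "$1" 2>/dev/null | tr -d '\000')
  case $first_bytes in
    "version https://git-lfs.github.com/spec/"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

If the function succeeds for a file in a fresh clone of the new location, the LFS objects were probably not transferred or not pulled; running git lfs pull in that clone is a reasonable next step.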
Migration
Things we had to do in preparation:
- Merge or close all open PRs, as they cannot be preserved (closed & merged ones are easily fixable, see below)
- Forcibly orphan all forks of the main repository with a note telling people to re-fork the repository after the migration and manually rebase their patches on the new repo
- Unprotect all branches
- Create the backup repo or enable PUSH_TO_CREATE
- Disable the creation of new forks and PRs during the migration
Example nginx config for the last point:
location /repo/fork/1 {
    add_header Content-Type "text/plain" always;
    return 503 'Forks are disabled while we are migrating to Git LFS';
}
location /iceshrimp/iceshrimp/compare {
    add_header Content-Type "text/plain" always;
    return 503 'PRs are disabled while we are migrating to Git LFS';
}
Here are the commands we used to migrate to git-lfs for future reference:
# Set up variables
folder=iceshrimp-lfs-migration
source=https://iceshrimp.dev/iceshrimp/iceshrimp.git
target=git@iceshrimp.dev:iceshrimp/iceshrimp
backup=git@iceshrimp.dev:iceshrimp/iceshrimp-pre-lfs-migration
# Clone the repo into a fresh directory
git clone --mirror "$source" "$folder"
cd "$folder"
# Backup the current state of the repository to a different repo
git remote add backup "$backup"
git push --mirror backup
git remote remove backup
# Save pre-migration rev-list
git rev-list --all > ~/rev-pre.txt
# Migrate all binary files to LFS
git lfs migrate import --include "*.zip,*.xcf,*.ai,group1-shard?of6,*.mp3,*.afdesign,*.blend,*.glb,*.psd,*.gz,*.woff2,*.enc,*.lockb,*.webp,*.png,*.jpg,*.ico,*.svg,*.gif" --everything
# Strip the now-invalid commit signatures
git remote remove origin
git filter-repo -f --replace-refs=update-no-add
git remote add target "$target"
# Resign commits we have the key for
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --commit-filter 'if [[ "$GIT_COMMITTER_EMAIL" = "laura@hausmann.dev" ]] || [[ "$GIT_COMMITTER_EMAIL" = "zotan@zotan.pw" ]]; then git commit-tree -S "$@"; else git commit-tree "$@"; fi;' -- --all "$(git rev-list --all --committer="laura@hausmann.dev" --committer="zotan@zotan.pw" | tail -n1).."
# Clean up tree
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
# Optimize repo
git reflog expire --expire=now --all && git gc --prune=now --aggressive
# Save post-migration rev-list
git rev-list --all > ~/rev-post.txt
# Upload the rewritten history to the forge
git push --mirror target -f
# Build commit id mappings
readarray -t rev_pre < ~/rev-pre.txt
readarray -t rev_post < ~/rev-post.txt
echo 'map_hash_max_size 32768;' >> ~/rev-mapping-nginx-map.conf
echo 'map_hash_bucket_size 512;' >> ~/rev-mapping-nginx-map.conf
echo 'map $uri $new_uri {' >> ~/rev-mapping-nginx-map.conf
for (( i=0; i<${#rev_pre[@]}; i++ )); do
  # Save the mapping for future reference
  echo "${rev_pre[$i]} ${rev_post[$i]}" >> ~/rev-mapping.txt
  # SQL commands to fix up all closed/merged PRs
  echo "UPDATE \"pull_request\" SET \"merge_base\"='${rev_post[$i]}' WHERE \"merge_base\"='${rev_pre[$i]}' AND \"base_repo_id\" = 1;" >> ~/rev-mapping-pr.sql
  # Redirect rules for nginx. We tried using rewrites at first, but at 26k rules the performance penalty is quite severe, especially when fetching LFS objects over HTTP.
  echo -e "\\t\"/iceshrimp/iceshrimp/commit/${rev_pre[$i]}\" \"/iceshrimp/iceshrimp/commit/${rev_post[$i]}\";" >> ~/rev-mapping-nginx-map.conf
done
echo '}' >> ~/rev-mapping-nginx-map.conf
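The mapping loop pairs the two rev-lists line by line, so it only makes sense if both files contain the same number of commits. A cheap sanity check before generating any SQL or nginx rules could look like the sketch below; the function name check_rev_lists is hypothetical, and equal counts are a necessary condition only, not proof that the order is identical.

```shell
# Compares the number of lines in the pre- and post-migration rev-lists.
# Prints the shared count on success, reports a mismatch on stderr otherwise.
check_rev_lists() {
  pre_count=$(wc -l < "$1" | tr -d ' ')
  post_count=$(wc -l < "$2" | tr -d ' ')
  if [ "$pre_count" -ne "$post_count" ]; then
    echo "rev-list length mismatch: $pre_count before, $post_count after" >&2
    return 1
  fi
  echo "$pre_count"
}
```

With the file names used in the migration commands, the call would be check_rev_lists ~/rev-pre.txt ~/rev-post.txt.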
Note: We tried various kinds of replace refs; however, as of this Gitea commit, Gitea (and by extension Forgejo) does not respect replace refs, so we had to settle for nginx redirects and updating the database manually.
To find issue comments with commit ids (paste into sudo -u postgres psql -d forgejo):
SELECT regexp_matches("content", '(?<=[^0-9a-f/\-\"])[0-9a-f]{10}(?=[^0-9a-f/\-\"])', 'g') FROM comment;
SELECT regexp_matches("content", '(?<=[^0-9a-f/\-\"])[0-9a-f]{40}(?=[^0-9a-f/\-\"])', 'g') FROM comment;
SELECT regexp_matches("content_text", '(?<=[^0-9a-f/\-\"])[0-9a-f]{10}(?=[^0-9a-f/\-\"])', 'g') FROM issue_content_history;
SELECT regexp_matches("content_text", '(?<=[^0-9a-f/\-\"])[0-9a-f]{40}(?=[^0-9a-f/\-\"])', 'g') FROM issue_content_history;
SELECT regexp_matches("content", '(?<=[^0-9a-f/\-\"])[0-9a-f]{10}(?=[^0-9a-f/\-\"])', 'g') FROM issue;
SELECT regexp_matches("content", '(?<=[^0-9a-f/\-\"])[0-9a-f]{40}(?=[^0-9a-f/\-\"])', 'g') FROM issue;
Once you have a text file with the commit IDs, use a script like the following to fix them. Modify the script as appropriate for each of the columns and tables mentioned above; if you get hits for abbreviated commit IDs, you'll have to adjust the WHERE clause and the regexp_replace search string as well:
readarray -t set < ~/set_issue_content.txt
readarray -t rev_pre < ~/rev-pre.txt
readarray -t rev_post < ~/rev-post.txt
for (( i=0; i<${#set[@]}; i++ )); do
  # Find the index of this commit ID in rev_pre: replace the match with "/", then count the words before it
  idx=$(echo ${rev_pre[@]/${set[$i]}//} | cut -d/ -f1 | wc -w | tr -d ' ')
  if [[ ${rev_post[$idx]} != '' ]]; then
    echo "UPDATE \"issue\" SET \"content\" = regexp_replace(\"content\", '(?<=[^0-9a-f/\-\\\"])${set[$i]}(?=[^0-9a-f/\-\\\"])', '${rev_post[$idx]}', 'g') WHERE \"content\" ~ '(?<=[^0-9a-f/\-\\\"])${set[$i]}(?=[^0-9a-f/\-\\\"])';"
  fi
done
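The index computation in the loop above is compact but hard to read. As a more explicit alternative for full-length commit IDs, the two rev-list files can be joined line by line and searched directly. This is a minimal sketch, not the script we ran; the function name lookup_new_hash is hypothetical, and unlike the loop above it does not match abbreviated IDs.

```shell
# Prints the post-migration hash for an old commit ID.
# $1 = old hash, $2 = pre-migration rev-list, $3 = post-migration rev-list.
# Fails (non-zero exit status) if the old hash is not in the list.
lookup_new_hash() {
  paste "$2" "$3" |
    awk -v old="$1" '$1 == old { print $2; found = 1; exit } END { exit !found }'
}
```

For example, lookup_new_hash <old hash> ~/rev-pre.txt ~/rev-post.txt prints the rewritten hash, which can then be substituted into the UPDATE statement in place of ${rev_post[$idx]}.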
Things we had to do after the migration:
- Run git reflog expire --expire=now --all && git gc --prune=now --aggressive in the Forgejo repo directory to clean up all the old refs and actually get the repo size improvements
Things we missed:
- Disabling email notifications during the migration would've prevented a couple of people from being spammed when rewritten commits mentioned issues they were subscribed to
- Disabling detection of "closes #issue-number" during the migration would've prevented a couple of issues from being closed, but those were easily reopened after we noticed it.
As this process rewrites all history, we had to reset the cloned repo on every CI system, and manually fix up all the forks. This was a lot of work, but the efficiency improvements were absolutely worth it.