AI Collaboration
12 min

The 'GitHub is Enough' Pitfall in AI-Driven Development: The Importance of Backup Strategy

A real case of data corruption affecting 38 articles in AI collaborative development. Learn from our experience why GitHub couldn't prevent it, how local backups became our savior, and the new backup strategy we built for the AI era.

AI Collaborative Development, Backup, GitHub, Data Corruption, Best Practices, Claude, Troubleshooting

Introduction: A Morning Nightmare

On the morning of June 17, 2025, we faced a shocking reality. Out of 51 news articles, 38 had completely incorrect content.

  • An article about a "sleep observation app" became "internal study session for Tokyo client"
  • An article about "haptic feedback technology" became "content creation service"
  • Multiple different articles were overwritten with identical content

Even more puzzling, this issue had supposedly been fixed the previous day. However, even the fix history had vanished.

Why Did This Happen Despite Having GitHub?

Many developers think "GitHub is enough." We thought so too. However, this incident taught us that GitHub alone is insufficient.

GitHub's Limitations

1. Uncommitted Data Is Not Protected

bash
# Problem occurs during work
$ node scripts/migrate-data.js  # Buggy script
# → Many files corrupted
# → Not committed yet = GitHub doesn't have correct data

2. Incorrect Changes Become "Correct History" Once Committed

bash
$ git add -A
$ git commit -m "feat: Data migration complete"  # Actually corrupted data
$ git push
# → Corrupted data recorded as "correct state" on GitHub

3. AI-Specific Issue: Memory Loss Between Sessions

AI (Claude, ChatGPT, etc.) forgets previous work in new sessions. Therefore:

  • It may repeat the same problems
  • Fix methods are lost if not documented

The Savior: Local Backups

What saved us was an accidentally preserved news.json.backup file.

javascript
// scripts/fix-wordpress-news.js
import fs from 'fs/promises';

// Load the last known-good data from the surviving backup file
const backupData = JSON.parse(
  await fs.readFile('../public/data/news.json.backup', 'utf8')
);

// Restore correct data from backup
// (wpNewsArticles and filePath are defined earlier in the actual script)
for (const article of wpNewsArticles) {
  // Regenerate each article with its original, correct content
  const correctData = backupData[article.id];
  await fs.writeFile(filePath, JSON.stringify(correctData, null, 2));
}

Practical Backup Strategy

1. Mandatory Backup Before Data Structure Changes

bash
#!/bin/bash
# scripts/pre-migration-backup.sh

# Backup with timestamp
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
mkdir -p backups   # ensure the backup directory exists
cp -r public/data "backups/data_${TIMESTAMP}"

echo "✅ Backup created: backups/data_${TIMESTAMP}"
echo "📝 Reason: Before major data structure change" >> backups/backup.log

2. Ensuring Restorability

javascript
// Prepare a restore script alongside the backup
import fs from 'fs/promises';

const createBackupWithRestoreScript = async (dataPath) => {
  // Colons and dots are replaced so the timestamp is safe in file names
  const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
  const backupPath = `${dataPath}.backup-${timestamp}`;

  // Create backup
  await fs.copyFile(dataPath, backupPath);

  // Generate restore script (the shebang must be the very first line)
  const restoreScript = `#!/bin/bash
# Restore script for ${dataPath}
# Created: ${timestamp}

cp "${backupPath}" "${dataPath}"
echo "✅ Restored from ${backupPath}"
`;

  await fs.writeFile(`restore-${timestamp}.sh`, restoreScript, { mode: 0o755 });
};

3. Essential Documentation for AI Collaboration

markdown
## Data Corruption Response Procedures

### When News Articles Are Corrupted
1. Restore from backup: `node scripts/fix-wordpress-news.js`
2. Rebuild the index: `node scripts/rebuild-news-index.js`

### Files Used
- Backup: `/public/data/news.json.backup`
- Translation data: `/i18n/locales/ja/news.json`

Lessons Learned: The Importance of Multi-Layer Defense

1. The 3-2-1 Backup Rule

  • Keep 3 copies (original + 2 backups)
  • Store them on 2 different media (local + cloud)
  • Keep 1 copy offsite (GitHub or cloud storage)
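As one way to put the rule into practice (a sketch, not this project's actual tooling), a small Node.js script can create the second, local copy and hand the offsite copy to a cloud CLI. The paths, script name, and the `rclone` remote below are assumptions for illustration.

javascript
// scripts/backup-321.js  (hypothetical sketch of the 3-2-1 rule)
import { execSync } from 'child_process';
import fs from 'fs';
import path from 'path';

const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const source = 'public/data';                                   // copy 1: the original
const localBackup = path.join('backups', `data_${timestamp}`);  // copy 2: local backup

// Copy 2: local backup on a second medium (e.g. an external disk mounted here)
fs.mkdirSync('backups', { recursive: true });
fs.cpSync(source, localBackup, { recursive: true });

// Copy 3: offsite. This assumes an rclone remote named "offsite" is configured;
// swap in whatever cloud CLI or SDK your project actually uses.
execSync(`rclone copy "${localBackup}" "offsite:site-backups/data_${timestamp}"`, {
  stdio: 'inherit',
});

console.log(`✅ 3-2-1 backup complete: ${localBackup} (local) + offsite copy`);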

2. AI Collaboration-Specific Measures

Strengthened Documentation Rules

markdown
### Protecting Important Fix History
- Always record data corruption fixes in CHANGELOG.md
- Document commands used
- Record backup file locations

Deletion Precautions

markdown
### Checks When Deleting Duplicate Files
1. Compare content, check for important records
2. grep for keywords (fix, repair, restore)
3. Merge different content before deletion
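To make step 2 concrete, here is a minimal sketch of scanning candidate files for fix-related keywords before deletion. The keyword list, script name, and CLI usage are assumptions, not the project's actual tooling.

javascript
// scripts/check-before-delete.js  (hypothetical)
// Refuse to delete files that mention fix/repair/restore work.
import fs from 'fs/promises';

const KEYWORDS = ['fix', 'repair', 'restore'];

const safeToDelete = async (filePath) => {
  const content = await fs.readFile(filePath, 'utf8');
  const hits = KEYWORDS.filter((kw) => content.toLowerCase().includes(kw));
  if (hits.length > 0) {
    console.warn(`⚠️  ${filePath} mentions ${hits.join(', ')}; review before deleting`);
    return false;
  }
  return true;
};

// Usage: node scripts/check-before-delete.js docs/notes-copy.md docs/notes-old.md
for (const file of process.argv.slice(2)) {
  if (await safeToDelete(file)) {
    console.log(`🗑  ${file} can be deleted`);
  }
}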

3. Tools to Implement

javascript
// scripts/backup-guard.js
// Automatically create backup before data changes

const guardedOperation = async (operation, dataPath) => {
  // 1. Automatic backup
  const backupPath = await createBackup(dataPath);
  
  try {
    // 2. Execute operation
    await operation();
    
    // 3. Integrity check
    const isValid = await validateData(dataPath);
    if (!isValid) {
      throw new Error('Data validation failed');
    }
    
  } catch (error) {
    // 4. Auto-restore on problems
    console.error('Operation failed, restoring backup...');
    await restoreBackup(backupPath, dataPath);
    throw error;
  }
};
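The helpers referenced above (createBackup, validateData, restoreBackup) are not shown in the original script. A minimal sketch of what they might look like, assuming the data files are JSON:

javascript
import fs from 'fs/promises';

// Copy the data file aside with a timestamp and return the backup path
const createBackup = async (dataPath) => {
  const backupPath = `${dataPath}.backup-${Date.now()}`;
  await fs.copyFile(dataPath, backupPath);
  return backupPath;
};

// Very light integrity check: the file must exist and parse as JSON
const validateData = async (dataPath) => {
  try {
    JSON.parse(await fs.readFile(dataPath, 'utf8'));
    return true;
  } catch {
    return false;
  }
};

// Put the backed-up content back in place
const restoreBackup = async (backupPath, dataPath) => {
  await fs.copyFile(backupPath, dataPath);
};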

Conclusion: "An Ounce of Prevention"

This incident serves as a warning against the overconfidence of "GitHub is enough" that modern developers tend to have.

Especially in AI-driven development, the risks include:
  • AI's memory loss between sessions
  • Expanded impact from massive automated changes
  • Changes at a pace beyond human review capacity

Against these risks, local backups are the last line of defense.

Action Items

  1. Implement Now
     - Create backup scripts for important data
     - Add the backup directory to .gitignore
     - Share the backup policy with your team
  2. Continuous Improvement
     - Make pre/post data change backups a habit
     - Document and test restore procedures
     - Prepare clear instructions for AI

The shift from "GitHub is enough" to "Both GitHub and backups make it safe" is essential for development in the AI era.
