Quick and dirty script to convert WordPress export file to Blogger / Atom XML
I’ve created a Python script that converts WordPress export files to Blogger/Atom XML format. Here’s how to use it:
The script takes two command-line arguments:
- wordpress_export.xml: Path to your WordPress export XML file
- blogger_export.xml: Path where you want to save the converted Blogger/Atom XML file
To run the script:
python wordpress_to_blogger.py wordpress_export.xml blogger_export.xml
The script performs the following conversions:
- Converts WordPress posts to Atom feed entries
- Preserves post titles, content, publication dates, and authors
- Maintains categories as Atom categories
- Handles post status (published/draft)
- Preserves HTML content formatting
- Converts dates to ISO format required by Atom
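As a minimal sketch of the date conversion in the last bullet: WordPress writes pubDate in the RSS style (for example "Tue, 05 Mar 2013 14:30:00 +0000"), while Atom expects an ISO 8601 timestamp with a UTC offset. The helper name pubdate_to_iso and the sample date are illustrative assumptions, not part of the script.

```python
from datetime import datetime


def pubdate_to_iso(pub_date: str) -> str:
    """Convert an RSS-style date (as found in pubDate) to ISO 8601."""
    # %a and %b match English day/month abbreviations in the default C locale
    parsed = datetime.strptime(pub_date, '%a, %d %b %Y %H:%M:%S %z')
    return parsed.isoformat()


print(pubdate_to_iso('Tue, 05 Mar 2013 14:30:00 +0000'))
# → 2013-03-05T14:30:00+00:00
```

Because %a and %b are locale-dependent, email.utils.parsedate_to_datetime from the standard library may be a safer choice on systems that run with a non-English locale.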
The script uses Python’s built-in xml.etree.ElementTree module for XML processing. The conversion is wrapped in a try/except block, so any failure prints an error message and exits with a non-zero status.
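WordPress export files put most post fields in XML namespaces (wp, dc, content), so lookups with ElementTree need a prefix-to-URI map. The following is a small sketch of that pattern; the SAMPLE document is made up for illustration, and the wp namespace URI assumes export version 1.2, which may differ in your file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed WordPress export used only for illustration
SAMPLE = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello world</title>
      <dc:creator>admin</dc:creator>
      <content:encoded><![CDATA[<p>First post</p>]]></content:encoded>
      <wp:post_type>post</wp:post_type>
      <wp:status>publish</wp:status>
    </item>
  </channel>
</rss>"""

# Prefixes here are local to the lookups; only the URIs must match the file
NS = {
    'wp': 'http://wordpress.org/export/1.2/',
    'content': 'http://purl.org/rss/1.0/modules/content/',
    'dc': 'http://purl.org/dc/elements/1.1/',
}

root = ET.fromstring(SAMPLE)
item = root.find('.//item')

post_type = item.find('wp:post_type', NS).text
creator = item.find('dc:creator', NS).text
body = item.find('content:encoded', NS).text

print(post_type)  # post
print(creator)    # admin
print(body)       # <p>First post</p>
```

If a lookup returns None, the usual cause is a namespace URI that does not match the one declared at the top of the export file.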
Some important notes:
- The script only converts posts (not pages or other content types)
- It preserves the HTML content of your posts
- It maintains the original publication dates
- It handles both published and draft posts
- The output is a valid Atom XML feed that Blogger can import
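Before importing, it can help to sanity-check the generated file. The sketch below counts entries and drafts in an Atom document; the summarize_feed helper and the SAMPLE_FEED text are assumptions for illustration, modeled on the element names the script writes (entry, app:control, app:draft).

```python
import xml.etree.ElementTree as ET

ATOM = {
    'atom': 'http://www.w3.org/2005/Atom',
    'app': 'http://www.w3.org/2007/app',
}

# Hypothetical output, shaped like what the conversion script produces
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:app="http://www.w3.org/2007/app">
  <title>Blog Posts</title>
  <entry>
    <title>Published post</title>
    <app:control><app:draft>no</app:draft></app:control>
  </entry>
  <entry>
    <title>Draft post</title>
    <app:control><app:draft>yes</app:draft></app:control>
  </entry>
</feed>"""


def summarize_feed(xml_text):
    """Return (number of entries, number of drafts) for an Atom feed."""
    root = ET.fromstring(xml_text)
    entries = root.findall('atom:entry', ATOM)
    drafts = [
        entry for entry in entries
        # Entries without an app:draft element are counted as published
        if entry.findtext('app:control/app:draft', default='no',
                          namespaces=ATOM) == 'yes'
    ]
    return len(entries), len(drafts)


print(summarize_feed(SAMPLE_FEED))  # (2, 1)
```

To check a real file, read it as bytes and pass the result to summarize_feed, then compare the counts with the number of posts and drafts in WordPress.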
The file:
#!/usr/bin/env python3
"""Convert a WordPress export (WXR) file to Blogger/Atom XML."""
import argparse
import sys
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace prefixes used in the WordPress export. The wp URI is versioned;
# change it if your export file declares something other than 1.2.
NS = {
    'wp': 'http://wordpress.org/export/1.2/',
    'content': 'http://purl.org/rss/1.0/modules/content/',
    'dc': 'http://purl.org/dc/elements/1.1/',
}


def child_text(item, path, default=''):
    """Return the text of a child element, or default if missing or empty."""
    node = item.find(path, NS)
    if node is None or node.text is None:
        return default
    return node.text


def convert_wordpress_to_blogger(wordpress_file, output_file):
    # Parse WordPress XML
    tree = ET.parse(wordpress_file)
    root = tree.getroot()

    # Create Atom feed; namespace declarations are written as plain attributes
    atom = ET.Element('feed', {
        'xmlns': 'http://www.w3.org/2005/Atom',
        'xmlns:app': 'http://www.w3.org/2007/app',
        'xmlns:thr': 'http://purl.org/syndication/thread/1.0',
    })

    # Add feed metadata (Atom timestamps need a UTC offset)
    now = datetime.now(timezone.utc).isoformat()
    title = ET.SubElement(atom, 'title')
    title.text = 'Blog Posts'
    updated = ET.SubElement(atom, 'updated')
    updated.text = now

    # Process each post, skipping pages, attachments and other content types
    for item in root.findall('.//item'):
        if child_text(item, 'wp:post_type') != 'post':
            continue

        entry = ET.SubElement(atom, 'entry')

        # Title
        ET.SubElement(entry, 'title').text = child_text(item, 'title')

        # Content
        content = ET.SubElement(entry, 'content', {'type': 'html'})
        content.text = child_text(item, 'content:encoded')

        # Publication date; fall back to the conversion time when pubDate
        # is missing or cannot be parsed (which can happen for drafts)
        published = ET.SubElement(entry, 'published')
        try:
            published.text = datetime.strptime(
                child_text(item, 'pubDate'), '%a, %d %b %Y %H:%M:%S %z'
            ).isoformat()
        except ValueError:
            published.text = now

        # Author
        author = ET.SubElement(entry, 'author')
        ET.SubElement(author, 'name').text = child_text(item, 'dc:creator')

        # Categories (a category without text has no usable term, so skip it)
        for category in item.findall('category'):
            if category.text:
                ET.SubElement(entry, 'category', {'term': category.text})

        # Status: anything other than 'publish' is treated as a draft.
        # The app prefix is already declared on the feed element.
        status = child_text(item, 'wp:status')
        app_control = ET.SubElement(entry, 'app:control')
        app_draft = ET.SubElement(app_control, 'app:draft')
        app_draft.text = 'no' if status == 'publish' else 'yes'

    # Write the output file
    ET.ElementTree(atom).write(output_file, encoding='utf-8', xml_declaration=True)


def main():
    parser = argparse.ArgumentParser(
        description='Convert WordPress export to Blogger/Atom XML format')
    parser.add_argument('wordpress_file', help='Path to WordPress export XML file')
    parser.add_argument('output_file', help='Path to output Blogger/Atom XML file')
    args = parser.parse_args()

    try:
        convert_wordpress_to_blogger(args.wordpress_file, args.output_file)
        print(f"Successfully converted {args.wordpress_file} to {args.output_file}")
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == '__main__':
    main()