Reply to Moving the pfSense® Documentation to GitHub on Wed, 06 Jun 2018 17:54:26 GMT

I used a PHP script to pull all the wiki pages out of the database and save them as flat files (note: in this version I forgot to add the title at the top of each page):

<?php
$servername = "localhost";
$username = "user";
$password = 'pa$$'; // single quotes so PHP never tries to interpolate the "$"
$dbname = "mediaWiki";

function slugify($text)
{
  // replace non letter or digits by -
  $text = preg_replace('~[^\pL\d]+~u', '-', $text);

  // remove unwanted characters
  $text = preg_replace('~[^-\w]+~', '', $text);

  // trim
  $text = trim($text, '-');

  // remove duplicate -
  $text = preg_replace('~-+~', '-', $text);

  // lowercase
  $text = strtolower($text);

  if (empty($text)) {
    return 'n-a';
  }

  return $text;
}

// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);

// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} 

$sql = "SELECT page_title, page_touched, old_text FROM revision,page,text WHERE revision.rev_id=page.page_latest AND text.old_id=revision.rev_text_id AND page.page_namespace=0 AND substring(text.old_text,2,8) NOT IN ('REDIRECT')";
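$result = $conn->query($sql);

// query() returns false on failure, so check before reading num_rows
if ($result === false) {
    die("Query failed: " . $conn->error);
}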

if ($result->num_rows > 0) {
    // output data of each row
    while($row = $result->fetch_assoc()) {
        $myfile = fopen(slugify($row["page_title"]).".mw", "w") or die("Unable to open file!");
        fwrite($myfile, $row["old_text"]);
        fclose($myfile);
    }
} else {
    echo "0 results";
}
$conn->close();

?>
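
The missing title mentioned above could be handled at write time. Here is a minimal Python sketch (not the original PHP) that approximates the slugify() logic and prepends the page title as a top-level MediaWiki heading before the text is saved. The helper name page_with_title and the choice of "= Title =" as the heading are assumptions for illustration.

```python
import re


def slugify(text: str) -> str:
    """Approximate Python port of the PHP slugify() above."""
    # Replace runs of anything that is not a letter or digit with "-"
    text = re.sub(r"[\W_]+", "-", text)
    # Drop whatever is left that is not "-" or an ASCII word character
    text = re.sub(r"[^-\w]+", "", text, flags=re.ASCII)
    # Trim leading/trailing "-" and collapse repeated "-"
    text = re.sub(r"-+", "-", text.strip("-"))
    return text.lower() or "n-a"


def page_with_title(page_title: str, wikitext: str) -> str:
    """Prepend the title as a level-1 MediaWiki heading (hypothetical helper)."""
    # MediaWiki stores spaces in page_title as underscores
    readable = page_title.replace("_", " ")
    return f"= {readable} =\n\n{wikitext}"


if __name__ == "__main__":
    title = "Installing_pfSense"  # example title, made up for this sketch
    print(slugify(title) + ".mw")
    print(page_with_title(title, "Some ''wiki'' text."))
```

With the heading in the .mw file, pandoc has a title to carry over into the converted page.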

I also found some commands on Stack Overflow to download all the images as a zip file.

I used pandoc to convert the MediaWiki syntax to RST (you could target Markdown or another format here and go in a different direction):

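# Make sure the output directory exists before pandoc writes into it
mkdir -p ./output

# Read NUL-delimited paths so unusual file names survive intact
find . -type f -name '*.mw' -print0 | while IFS= read -r -d '' item
do
  filename=${item##*/}

  pandoc "$item" -f mediawiki -t rst -o "./output/${filename%.*}.rst" || printf "   %s conversion failed\n" "$filename"
done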

Then I massaged all of that into the Sphinx formatting I wanted. It took several custom Python/bash scripts to clean up the pandoc conversion (it isn't perfect).
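
As one illustration of that kind of massaging, the sketch below builds a Sphinx index.rst with a toctree listing every converted page. It is a hypothetical example, not one of the original scripts; the function name build_index, the heading text, and the maxdepth value are assumptions, and it expects the pandoc output in ./output as in the loop above.

```python
from pathlib import Path


def build_index(doc_names, title="Documentation"):
    """Return the text of an index.rst whose toctree lists doc_names."""
    lines = [title, "=" * len(title), "", ".. toctree::", "   :maxdepth: 1", ""]
    # toctree entries are document names without the .rst suffix
    lines += [f"   {name}" for name in sorted(doc_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    out = Path("output")  # assumed location of the converted .rst files
    names = []
    if out.is_dir():
        names = [p.stem for p in out.glob("*.rst") if p.stem != "index"]
    print(build_index(names))
```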

Then I built the Sphinx docs as HTML and ran the npm package broken-link-checker-local against the output to check for broken links (more Python scripts were involved in fixing them).
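
The idea behind that link check can be sketched in Python. This is not the broken-link-checker-local package, only a simplified stand-in that checks relative href targets in the built HTML against the files on disk. The names LinkCollector and find_broken_links, and the build directory _build/html, are assumptions.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs += [v for k, v in attrs if k == "href" and v]


def find_broken_links(build_dir):
    """Return (page, href) pairs whose relative target file is missing."""
    broken = []
    for page in sorted(Path(build_dir).rglob("*.html")):
        parser = LinkCollector()
        parser.feed(page.read_text(encoding="utf-8", errors="replace"))
        for href in parser.hrefs:
            parts = urlsplit(href)
            # Skip external links, mailto:, pure "#anchor" links,
            # and root-relative paths (not handled in this sketch)
            if parts.scheme or parts.netloc or not parts.path:
                continue
            if parts.path.startswith("/"):
                continue
            target = page.parent / unquote(parts.path)
            if not target.exists():
                broken.append((str(page), href))
    return broken


if __name__ == "__main__":
    # "_build/html" is an assumed output directory
    for page, href in find_broken_links("_build/html"):
        print(f"{page}: broken link {href}")
```

It only verifies that the target file exists; it does not check that a "#fragment" resolves inside that file.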

I also used Git as a backup so I could git checkout if my scripts blew anything up along the way.

That's about all the advice I can offer. It's a lot of work, but worth it in the end. Good luck!
