MyBB Community Forums

Full Version: Parsing HTML code to BBCode in Database
I am trying to retrieve questions from the Stack Overflow API and insert them into my database so that each one appears as a newly created thread. The API returns the 'body' field as HTML, and MyBB does not render HTML stored in the "message" column of the "mybb_posts" table.

Here's an example of the JSON data that the API sends back:

{
  "items": [
    {
      "tags": [
        "java",
        "c++",
        "performance",
        "optimization",
        "branch-prediction"
      ],
      "owner": {
        "reputation": 240497,
        "user_id": 87234,
        "user_type": "registered",
        "accept_rate": 100,
        "profile_image": "https://i.stack.imgur.com/FkjBe.png?s=128&g=1",
        "display_name": "GManNickG",
        "link": "http://stackoverflow.com/users/87234/gmannickg"
      },
      "is_answered": true,
      "view_count": 919722,
      "protected_date": 1399067470,
      "accepted_answer_id": 11227902,
      "answer_count": 14,
      "score": 16772,
      "last_activity_date": 1484676151,
      "creation_date": 1340805096,
      "last_edit_date": 1478049098,
      "question_id": 11227809,
      "link": "http://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array",
      "title": "Why is it faster to process a sorted array than an unsorted array?",
      "body": "<p>Here is a piece of C++ code that seems very peculiar. For some strange reason, sorting the data miraculously makes the code almost six times faster.</p>\n\n<pre class=\"lang-cpp prettyprint-override\"><code>#include &lt;algorithm&gt;\n#include &lt;ctime&gt;\n#include &lt;iostream&gt;\n\nint main()\n{\n    // Generate data\n    const unsigned arraySize = 32768;\n    int data[arraySize];\n\n    for (unsigned c = 0; c &lt; arraySize; ++c)\n        data[c] = std::rand() % 256;\n\n    // !!! With this, the next loop runs faster\n    std::sort(data, data + arraySize);\n\n    // Test\n    clock_t start = clock();\n    long long sum = 0;\n\n    for (unsigned i = 0; i &lt; 100000; ++i)\n    {\n        // Primary loop\n        for (unsigned c = 0; c &lt; arraySize; ++c)\n        {\n            if (data[c] &gt;= 128)\n                sum += data[c];\n        }\n    }\n\n    double elapsedTime = static_cast&lt;double&gt;(clock() - start) / CLOCKS_PER_SEC;\n\n    std::cout &lt;&lt; elapsedTime &lt;&lt; std::endl;\n    std::cout &lt;&lt; \"sum = \" &lt;&lt; sum &lt;&lt; std::endl;\n}\n</code></pre>\n\n<ul>\n<li>Without <code>std::sort(data, data + arraySize);</code>, the code runs in 11.54 seconds.</li>\n<li>With the sorted data, the code runs in 1.93 seconds.</li>\n</ul>\n\n<p>Initially, I thought this might be just a language or compiler anomaly. So I tried it in Java.</p>\n\n<pre class=\"lang-java prettyprint-override\"><code>import java.util.Arrays;\nimport java.util.Random;\n\npublic class Main\n{\n    public static void main(String[] args)\n    {\n        // Generate data\n        int arraySize = 32768;\n        int data[] = new int[arraySize];\n\n        Random rnd = new Random(0);\n        for (int c = 0; c &lt; arraySize; ++c)\n            data[c] = rnd.nextInt() % 256;\n\n        // !!! 
With this, the next loop runs faster\n        Arrays.sort(data);\n\n        // Test\n        long start = System.nanoTime();\n        long sum = 0;\n\n        for (int i = 0; i &lt; 100000; ++i)\n        {\n            // Primary loop\n            for (int c = 0; c &lt; arraySize; ++c)\n            {\n                if (data[c] &gt;= 128)\n                    sum += data[c];\n            }\n        }\n\n        System.out.println((System.nanoTime() - start) / 1000000000.0);\n        System.out.println(\"sum = \" + sum);\n    }\n}\n</code></pre>\n\n<p>With a somewhat similar but less extreme result.</p>\n\n<hr>\n\n<p>My first thought was that sorting brings the data into the cache, but then I thought how silly that is because the array was just generated.</p>\n\n<ul>\n<li>What is going on?</li>\n<li>Why is a sorted array faster to process than an unsorted array?</li>\n<li>The code is summing up some independent terms, and the order should not matter.</li>\n</ul>\n"
    }
  ]
}

I need to convert the HTML in the body to MyBB's format (MyCode/BBCode) and then insert the result into the message column. Is there any way to do this? Sorry if this has already been answered, but I was unable to find a solution anywhere.

Thanks!
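
Before any conversion, the title and body have to be pulled out of the response. A minimal sketch of that step, assuming the response has the shape shown above; the `$json` string is a trimmed-down stand-in, and the `subject`/`message` keys are only illustrative names for the values that would end up in the thread:

```php
<?php
// Hypothetical, shortened response in the same shape as the example above.
$json = '{"items":[{"question_id":11227809,"title":"Why is it faster to process a sorted array than an unsorted array?","body":"<p>Here is a piece of C++ code.</p>\n"}]}';

// Decode into associative arrays; JSON_THROW_ON_ERROR (PHP 7.3+) throws on malformed input.
$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

$threads = [];
foreach ($data['items'] ?? [] as $item) {
    $threads[] = [
        'subject' => $item['title'],
        'message' => $item['body'], // still HTML at this point
    ];
}

echo $threads[0]['subject'], "\n";
```

The `message` value is still HTML here, so it needs converting before it is stored.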

Would this work? :confused:

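// Map each HTML tag in the body to its MyCode equivalent. MyBB has no [p], [ul] or [li]; lists use [list] and [*].
// Newlines are left alone: MyBB converts them itself and shows a raw <br> as text when HTML is disabled.
$arrFrom = array("<p>", "</p>", "<b>", "</b>", "<code>", "</code>", '<pre class="lang-cpp prettyprint-override">', '<pre class="lang-java prettyprint-override">', "</pre>", "<ul>", "</ul>", "<li>", "</li>", "<hr>");
$arrTo = array("", "", "[b]", "[/b]", "[code]", "[/code]", "", "", "", "[list]", "[/list]", "[*]", "", "[hr]");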

$code = "<p>Here is a piece of C++ code that seems very peculiar. For some strange reason, sorting the data miraculously makes the code almost six times faster.</p>\n\n<pre class=\"lang-cpp prettyprint-override\"><code>#include &lt;algorithm&gt;\n#include &lt;ctime&gt;\n#include &lt;iostream&gt;\n\nint main()\n{\n    // Generate data\n    const unsigned arraySize = 32768;\n    int data[arraySize];\n\n    for (unsigned c = 0; c &lt; arraySize; ++c)\n        data[c] = std::rand() % 256;\n\n    // !!! With this, the next loop runs faster\n    std::sort(data, data + arraySize);\n\n    // Test\n    clock_t start = clock();\n    long long sum = 0;\n\n    for (unsigned i = 0; i &lt; 100000; ++i)\n    {\n        // Primary loop\n        for (unsigned c = 0; c &lt; arraySize; ++c)\n        {\n            if (data[c] &gt;= 128)\n                sum += data[c];\n        }\n    }\n\n    double elapsedTime = static_cast&lt;double&gt;(clock() - start) / CLOCKS_PER_SEC;\n\n    std::cout &lt;&lt; elapsedTime &lt;&lt; std::endl;\n    std::cout &lt;&lt; \"sum = \" &lt;&lt; sum &lt;&lt; std::endl;\n}\n</code></pre>\n\n<ul>\n<li>Without <code>std::sort(data, data + arraySize);</code>, the code runs in 11.54 seconds.</li>\n<li>With the sorted data, the code runs in 1.93 seconds.</li>\n</ul>\n\n<p>Initially, I thought this might be just a language or compiler anomaly. So I tried it in Java.</p>\n\n<pre class=\"lang-java prettyprint-override\"><code>import java.util.Arrays;\nimport java.util.Random;\n\npublic class Main\n{\n    public static void main(String[] args)\n    {\n        // Generate data\n        int arraySize = 32768;\n        int data[] = new int[arraySize];\n\n        Random rnd = new Random(0);\n        for (int c = 0; c &lt; arraySize; ++c)\n            data[c] = rnd.nextInt() % 256;\n\n        // !!! 
With this, the next loop runs faster\n        Arrays.sort(data);\n\n        // Test\n        long start = System.nanoTime();\n        long sum = 0;\n\n        for (int i = 0; i &lt; 100000; ++i)\n        {\n            // Primary loop\n            for (int c = 0; c &lt; arraySize; ++c)\n            {\n                if (data[c] &gt;= 128)\n                    sum += data[c];\n            }\n        }\n\n        System.out.println((System.nanoTime() - start) / 1000000000.0);\n        System.out.println(\"sum = \" + sum);\n    }\n}\n</code></pre>\n\n<p>With a somewhat similar but less extreme result.</p>\n\n<hr>\n\n<p>My first thought was that sorting brings the data into the cache, but then I thought how silly that is because the array was just generated.</p>\n\n<ul>\n<li>What is going on?</li>\n<li>Why is a sorted array faster to process than an unsorted array?</li>\n<li>The code is summing up some independent terms, and the order should not matter.</li>\n</ul>\n";

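// Replace the tags, then decode entities such as &lt; so the message holds the literal characters.
echo html_entity_decode(str_replace($arrFrom, $arrTo, $code), ENT_QUOTES, 'UTF-8');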



Never mind, I figured it out. The code above does work.
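
The str_replace approach lists every `<pre class="lang-...">` variant by hand, so a question tagged with another language would slip through. A rough sketch of a more general version using preg_replace follows. `html_to_mycode()` is a hypothetical helper name, not part of MyBB, and it only covers the tags that appear in the example body:

```php
<?php
// html_to_mycode() is a hypothetical helper, not a MyBB function.
function html_to_mycode(string $html): string
{
    // Rules are applied in order; each covers one kind of tag from the example body.
    $rules = [
        // <pre ...><code> ... </code></pre> blocks, whatever the lang-* class is
        '~<pre[^>]*>\s*<code>(.*?)</code>\s*</pre>~s' => '[code]$1[/code]',
        // inline <code> spans left over after the block rule
        '~<code>(.*?)</code>~s' => '[code]$1[/code]',
        '~<(?:b|strong)>(.*?)</(?:b|strong)>~s' => '[b]$1[/b]',
        '~<ul>~' => '[list]',
        '~</ul>~' => '[/list]',
        '~<li>~' => '[*]',
        '~</li>~' => '',
        '~<hr\s*/?>~' => '[hr]',
        // paragraphs are already separated by newlines, so the tags are dropped
        '~</?p>~' => '',
    ];
    $text = preg_replace(array_keys($rules), array_values($rules), $html);

    // Turn &lt; &gt; &amp; &quot; back into literal characters.
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Small sample in the same style as the API body.
$body = "<p>Try <b>this</b>:</p>\n<pre class=\"lang-cpp prettyprint-override\"><code>if (a &lt; b) {}\n</code></pre>\n<ul>\n<li>one</li>\n<li>two</li>\n</ul>\n<hr>\n";
echo html_to_mycode($body);
```

Regular expressions are adequate for the predictable markup this API returns; for arbitrary HTML a real parser such as PHP's DOMDocument would be the safer choice.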