Parser is now stable!

Sep 1

We did it!

The newest version of the parser / parsing library works with 100% accuracy across the PDFs. Not just Physics, Maths, or Computer Science, but every A-level PDF. There is still some room for improvement, which I will work on tomorrow. But I guess I will now gradually start shifting my focus from OpenPastPaper towards the hackathon I am currently attending.

Anyway, it can classify all the questions -- along with sub-questions -- in any PDF. Sub-questions can be important if we are going to separate them, as each sub-question can belong to a different chapter, which will be interesting to see. I am still skeptical about the reliability of sub-question extraction and its use case for now, and I may just get rid of it in the future if it proves too time-consuming.

Bug -- what is this stupid bug.

The problem is with a dictionary that stores some metadata about the characters. When I append it to an array, every entry in the array seems to hold the same single copy of the metadata -- which is weird, because this is not supposed to happen, and right now I cannot even pinpoint the bug.
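
For anyone who runs into the same symptom: the post does not record what the actual cause was, so the following is only a guess at a common one. Assuming the parser is written in Python, appending the same dict object on every loop iteration stores a reference rather than a copy, so every list entry ends up showing the last values written. The function names and the `char` / `size` keys below are hypothetical, made up purely for illustration.

```python
def collect_shared(chars):
    """Buggy pattern: one dict is reused, so every entry is the same object."""
    meta = {}  # created once, outside the loop
    out = []
    for ch in chars:
        meta["char"] = ch
        meta["size"] = 12   # hypothetical metadata field
        out.append(meta)    # appends a reference to the same dict each time
    return out


def collect_fresh(chars):
    """Fixed pattern: build a new dict for every character."""
    out = []
    for ch in chars:
        meta = {"char": ch, "size": 12}  # new object per iteration
        out.append(meta)
    return out


shared = collect_shared("abc")
fresh = collect_fresh("abc")

print([m["char"] for m in shared])  # ['c', 'c', 'c']
print([m["char"] for m in fresh])   # ['a', 'b', 'c']
print(shared[0] is shared[1])       # True: the entries are one object
```

If the dict has to be built up outside the loop, appending `dict(meta)` (or `copy.deepcopy(meta)` for nested values) instead of `meta` is another way to get independent entries.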

Update -- the bug has been solved!

Parser is now stable!
