Wikiextractor appears to have bugs that limit us to Python 3.10 or earlier when building the index.
Can we replace wikiextractor, either by fixing the bug and using the patched version, or by switching to a better-maintained codebase with better support?
import mwparserfromhell
mediawiki_text = """
== Section 1 ==
This is some [[content]] in [[link|section 1]].
== {{Section 2}} ==
This is some content in section 2.
"""
# strip_code() removes templates and wiki markup, keeping the display text of links
ans = mwparserfromhell.parse(mediawiki_text).strip_code().strip()
# 'Section 1 \nThis is some content in section 1.\n\n \nThis is some content in section 2.'
The code above uses mwparserfromhell, a library that is still under active development.