Feat/add grobid #22

Merged
merged 5 commits into from
Mar 28, 2024
3 changes: 3 additions & 0 deletions backend/app.js
@@ -8,6 +8,7 @@ const indexRouter = require('./routes/index');
const usersRouter = require('./routes/users');
const testAPIRouter = require("./routes/testAPI");
const databaseRouter = require("./routes/database");
const extractRouter = require("./routes/extract");
const bodyParser = require('body-parser');
const config = require("./config.json");

@@ -40,6 +41,8 @@ app.use('/', indexRouter);
app.use('/users', usersRouter);
app.use("/testAPI", testAPIRouter);
app.use("/database", databaseRouter);
app.use("/extract", extractRouter);


// catch 404 and forward to error handler
app.use(function (req, res, next) {
168 changes: 167 additions & 1 deletion backend/package-lock.json

Some generated files are not rendered by default.

5 changes: 4 additions & 1 deletion backend/package.json
@@ -8,18 +8,21 @@
"dev-docker": "nodemon --legacy-watch ./bin/www"
},
"dependencies": {
"axios": "^1.6.8",
"body-parser": "^1.20.2",
"cookie-parser": "~1.4.4",
"cors": "^2.8.5",
"debug": "~2.6.9",
"express": "~4.18.2",
"form-data": "^4.0.0",
"http-errors": "~1.6.3",
"jade": "^1.9.2",
"morgan": "~1.10.0",
"multer": "^1.4.5-lts.1",
"pg": "^8.8.0",
"wink-eng-lite-web-model": "^1.5.2",
"wink-nlp": "^1.14.3"
"wink-nlp": "^1.14.3",
"xml2js": "^0.6.2"
},
"devDependencies": {
"nodemon": "^3.0.2"
47 changes: 47 additions & 0 deletions backend/routes/extract.js
@@ -0,0 +1,47 @@
const express = require('express');
const axios = require('axios');
const FormData = require('form-data');
const multer = require('multer'); // Import multer
const xml2js = require('xml2js');
const router = express.Router();

const upload = multer({ storage: multer.memoryStorage() });

router.post("/", upload.single('file'), async (req, res) => {
  // Ensure a file is actually provided
  if (!req.file) {
    return res.status(400).send('No file uploaded.');
  }

  const formData = new FormData();
  formData.append('input', req.file.buffer, {
    filename: req.file.originalname,
    contentType: req.file.mimetype,
    knownLength: req.file.size,
  });

  try {
    const grobidResponse = await axios.post('http://grobid:8070/api/processFulltextDocument', formData, {
      headers: {
        ...formData.getHeaders(),
      },
      responseType: 'text', // Changed to 'text' to handle XML response correctly
    });

    // Convert XML response to JSON
    xml2js.parseString(grobidResponse.data, (err, result) => {
      if (err) {
        console.error('Error parsing XML: ', err);
        return res.status(500).send('Error parsing XML response');
      }

      res.json(result);
    });

  } catch (error) {
    console.error('Error when calling GROBID: ', error);
    res.status(500).send('Error when processing the PDF file');
  }
});

module.exports = router;
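Reviewer sketch (not part of this PR): the route in backend/routes/extract.js only checks that a file was attached before forwarding it to GROBID. Assuming one wanted to reject non-PDF uploads early, a hypothetical helper named `looksLikePdf` could check the buffer for the `%PDF-` signature. The helper name and the strict "signature at byte 0" rule are assumptions for illustration; some PDF readers tolerate leading bytes before the header.

```javascript
// Hypothetical helper, not part of this PR: reports whether an uploaded
// buffer begins with the "%PDF-" signature.
const PDF_SIGNATURE = Buffer.from('%PDF-', 'ascii');

function looksLikePdf(buffer) {
  // Reject anything that is not a Buffer or is too short to hold the signature.
  if (!Buffer.isBuffer(buffer) || buffer.length < PDF_SIGNATURE.length) {
    return false;
  }
  // Compare only the leading bytes against the signature.
  return buffer.subarray(0, PDF_SIGNATURE.length).equals(PDF_SIGNATURE);
}

console.log(looksLikePdf(Buffer.from('%PDF-1.7\n...')));  // true
console.log(looksLikePdf(Buffer.from('<html></html>')));  // false
```

In the route, such a check would presumably sit right after the `!req.file` guard and return a 4xx response when `looksLikePdf(req.file.buffer)` is false.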
2 changes: 1 addition & 1 deletion docker-compose.dev.yml
@@ -42,7 +42,7 @@ services:
- backend

grobid:
image: lfoppiano/grobid:0.7.0
image: lfoppiano/grobid:0.8.0
ports:
- "8070:8070"
- "8071:8071"
5 changes: 1 addition & 4 deletions docker-compose.yml
@@ -36,10 +36,7 @@ services:
- backend

grobid:
image: lfoppiano/grobid:0.7.0
ports:
- "8070:8070"
- "8071:8071"
image: lfoppiano/grobid:0.8.0

volumes:
postgres_data:
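Reviewer sketch (not part of this PR): docker-compose.yml no longer publishes ports for the `grobid` service, while docker-compose.dev.yml still maps 8070 and 8071. The backend reaches GROBID at the hard-coded `http://grobid:8070`, which relies on the Compose service name. Assuming the base URL should be configurable, for example when running the backend outside Compose, a hypothetical helper could read it from the environment. The variable name `GROBID_URL` and the function name `grobidEndpoint` are assumptions, not something this PR defines.

```javascript
// Hypothetical helper, not part of this PR. GROBID_URL is an assumed
// environment variable; the fallback matches the host and port that
// backend/routes/extract.js currently hard-codes.
function grobidEndpoint(env = process.env) {
  const base = env.GROBID_URL || 'http://grobid:8070';
  // Resolve the fixed API path against the configured base URL.
  return new URL('/api/processFulltextDocument', base).toString();
}

console.log(grobidEndpoint({}));
// http://grobid:8070/api/processFulltextDocument
console.log(grobidEndpoint({ GROBID_URL: 'http://localhost:8070' }));
// http://localhost:8070/api/processFulltextDocument
```

The route could then call `axios.post(grobidEndpoint(), formData, ...)` in place of the literal URL, leaving the in-Compose default unchanged.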