AI-Powered Search Engine for Research Papers (Project for NLP)

Overview

This project aims to build an intelligent search engine that understands the semantic meaning of research queries and ranks documents based on abstract importance and key concept extraction.

Concepts Used

  • Sentence Transformers – Understanding query semantics
  • LSTM, RNN – Document ranking based on abstract importance
  • CBOW Embeddings – Extracting key concepts from papers
  • Named Entity Recognition (NER) – Extracting citations, authors, and key entities

Datasets

  • arXiv metadata dump (arxiv_metadata.json, about 4 GB); see the local-setup notes below for where to place it

Features

  • Semantic Search: uses Sentence Transformers to improve query understanding
  • Intelligent Ranking: LSTM-based ranking of research papers
  • Concept Extraction: CBOW embeddings identify key topics
  • Entity Recognition: NER extracts authors, citations, and key entities
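
The intelligent-ranking feature could be sketched as a small LSTM scorer in PyTorch. The dimensions, class name, and toy batch below are illustrative assumptions; the model actually used in this repo may differ:

```python
import torch
import torch.nn as nn

class AbstractRanker(nn.Module):
    """Score an abstract's relevance from its token ids (sketch)."""

    def __init__(self, vocab_size: int = 10_000, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)  # one relevance score per abstract

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embed(token_ids)       # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)      # final hidden state: (1, batch, hidden_dim)
        return self.score(h_n[-1]).squeeze(-1) # (batch,) relevance scores

# Toy batch: 4 abstracts of 12 token ids each.
ranker = AbstractRanker()
scores = ranker(torch.randint(0, 10_000, (4, 12)))
order = scores.argsort(descending=True)  # abstract indices, highest-scoring first
```

Training such a ranker (e.g. with a pairwise or pointwise relevance loss) is out of scope here; the sketch only shows how an LSTM can map a variable-length abstract to a single ranking score.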

🛠 Tech Stack

  • Language Models: BERT, Sentence Transformers
  • Deep Learning: LSTM, RNN, PyTorch/TensorFlow
  • NLP Techniques: Named Entity Recognition (NER), Word Embeddings (CBOW)
  • Database: PostgreSQL / MongoDB for storing research papers
  • Backend: FastAPI / Flask
  • Frontend: Next.js, TypeScript, TailwindCSS

General Instructions before installing locally

  • Make sure Node.js and Python are installed before following the next steps
  • To check for Node.js and Python, run these commands in a terminal:
    node -v
    python --version
  • If anything about the folder structure is unclear, feel free to ask
  • arxiv_metadata.json is listed in .gitignore because it is about 4 GB; download it locally and place it in the search_engine/data/raw folder

Installation

  1. Clone the repository:

    git clone https://github.com/Augnik03/ResearchAI.git
    cd ResearchAI
  2. For working on the frontend:

    cd frontend
    npm i
  • If an error occurs, run one of these instead:
    npm i --legacy-peer-deps
    npm i --force
  • To run the development server:
    npm run dev
  3. For working on the backend:

    cd search_engine
  • Before installing Python dependencies, create and activate a virtual environment:
    python -m venv venv
    venv\Scripts\activate        (Windows)
    source venv/bin/activate     (macOS/Linux)
  4. Install dependencies:

    pip install -r requirements.txt
  5. To run scripts on the dataset:

    python preprocess.py

Contributing

Contributions are welcome! Fork the repo, create a feature branch, and submit a PR.
