Python - AI Scraping Engineer

N of 1 inc · EG · $34k

Actively hiring · Posted 1 day ago

Job description:

We are looking for a highly skilled Python Engineer with deep experience building large-scale web scraping pipelines and AI-powered data processing systems. This role is focused on extracting, normalizing, and enriching large volumes of structured and unstructured data using Python, LLMs (e.g., OpenAI), and AWS-based containerized infrastructure.

You will own the end-to-end lifecycle of data ingestion: from scraping and document processing, through AI-driven enrichment and classification, to deployment in scalable cloud environments.

Key Responsibilities

  • Design, build, and maintain high-reliability Python scraping systems for collecting data from complex, dynamic, and unstructured web sources (HTML, PDFs, APIs, documents).
  • Implement AI-assisted extraction, classification, summarization, and normalization pipelines using large language models (e.g., OpenAI).
  • Develop resilient scraping architectures with rate-limiting, retries, proxy management, CAPTCHA handling, and change detection.
  • Build data processing pipelines that clean, transform, deduplicate, and enrich scraped content for downstream analytics and ML workflows.
  • Develop and maintain containerized Python services using Docker and deploy them at scale via AWS ECS and related services.
  • Integrate LLMs into automated workflows for document parsing, entity extraction, taxonomy mapping, and insight generation.
  • Design and expose internal APIs for triggering scraping jobs, processing data, and retrieving AI-generated outputs.
  • Manage cloud resources across AWS (ECS, S3, Lambda, RDS, CloudWatch) with a focus on scalability, reliability, and cost efficiency.
  • Optimize scraping and AI pipelines for performance, throughput, and fault tolerance.
  • Implement monitoring, logging, and alerting for long-running scraping and AI workloads.
  • Write clear technical documentation covering scraping logic, AI workflows, and deployment patterns.

Qualifications

  • Strong Python engineering background with a focus on data ingestion and scraping systems.
  • Extensive experience building web scrapers using tools such as BeautifulSoup, Scrapy, Playwright, Selenium, or similar frameworks.
  • Hands-on experience integrating LLM APIs (e.g., OpenAI) into production systems.
  • Proven ability to handle unstructured data (HTML, PDFs, text blobs) and convert it into structured outputs.
  • Experience containerizing Python applications with Docker and deploying them using AWS ECS.
  • Solid understanding of AWS services including S3, ECS, Lambda, RDS, and CloudWatch.
  • Experience designing and consuming RESTful APIs.
  • Familiarity with CI/CD pipelines, Git-based workflows, and automated testing.
  • Strong grasp of software engineering best practices: modularity, observability, error handling, and performance optimization.
  • Ability to work independently in ambiguous problem spaces and iterate quickly.
  • Clear written and verbal communication skills, especially around complex technical systems.

Pay: E£27,500.00 - E£35,000.00 per month

Work Location: Remote

Tags & focus areas

Full-time · Remote · AI · Generative AI
