Patexia Lead Data Engineer

Req. No.: PA1050
Title: Lead Data Engineer
Type: Full Time
Location: LATAM/EU (remote)

Patexia, a forward-thinking technology company specializing in intellectual property and patent solutions, is seeking a Lead Data Engineer to join our remote team. As a Lead Data Engineer, you will drive the use of advanced NLP tools and AI technologies to solve complex problems in entity resolution and patent data analysis.


Key Responsibilities:

  • Spearhead prompt engineering initiatives using OpenAI's ChatGPT to address complex text processing challenges.
  • Apply data engineering expertise with Google Cloud Platform (GCP) tools such as BigQuery (BQ) and Google Cloud Storage (GCS) to ensure efficient data storage and retrieval.
  • Lead cross-functional teams in understanding project requirements, devising innovative solutions using NLP tools, and ensuring project success.
  • Seamlessly transition between Data Scientist, Data Analyst, and Data Engineer roles based on project demands.
  • Manage SQL and NoSQL databases and take on essential data engineering tasks as required.
  • Present technical findings clearly and explain complex concepts to non-technical stakeholders.
  • Stay updated on the latest advancements in NLP technologies, evaluate their relevance to ongoing projects, and drive continuous innovation.
  • Document solutions comprehensively and actively contribute to knowledge sharing within the team.


Qualifications & Skills:


  • Proven experience in prompt engineering, with a strong emphasis on OpenAI's ChatGPT.
  • Proficiency in data engineering demonstrated through hands-on experience with GCP tools (BQ, GCS).
  • Solid understanding of NLP concepts and methodologies.
  • Ability to transition between Data Scientist, Data Analyst, and Data Engineer roles seamlessly.
  • Strong analytical abilities and adept problem-solving skills.
  • Excellent written and verbal communication skills.
  • Self-motivated and capable of working both independently and collaboratively within a remote team environment.


  • Familiarity with machine learning and NLP techniques and models such as regression analysis, BERT, LLaMA, Transformers, and LSTMs.
  • Experience with BigQuery Machine Learning (BQ ML) and vector databases.
  • Knowledge of Google Dataflow and Google Dataproc for data processing tasks.


We are particularly interested in candidates who have experience leading engineering teams and a track record of successfully delivering projects on a tech stack similar to ours: React, Next.js, TypeScript, PHP, Python, and GCP.