TamilKavi: Release of Python Package & Dataset

Hi guys 👋

Today, I want to share something unexpected. To be honest, if someone had told me a month ago that I could do this, I wouldn’t have believed them. But here we are — I’ve finally released a Python package and dataset called TamilKavi. I still can’t believe I pulled it off, but it’s real!

I’d love to share the whole story with you. Many of you already know me: I write Tamil poetry and have even published two books. However, I ran into font issues when trying to release them on Amazon and Kindle. Frustrated, I reached out to my community friend, Hari, and asked:
“Bro, I want to release my Tamil poetry book on Amazon, but I’m stuck with font issues. Do you know anyone who can solve it?”

Hari referred me to Ayyanar Bro and, to my surprise, he was from Madurai. What a coincidence! We spoke almost four times a week for different reasons. I have already written about him and his portfolio website, which he built using Emacs & Org, so I won’t go into more detail here; you guys might find it repetitive.

Through Ayyanar Bro, I learned about the Tamil Kanchilung community and FreeTamilBooks, where I finally found a solution to my font issue. But here’s another twist: FreeTamilBooks required more poems for my book release. I wanted to release this book on FreeTamilBooks and then another book on Amazon. That was another headache because, with my tight schedule, I barely had time to write.

While navigating all this, I discovered Tamilrulepy, a Python package with Tamil grammar rules. I was eager to learn more, and unexpectedly, I got an opportunity to contribute to it! That’s when I met Boopalan — another passionate tech enthusiast like me. He helped me write code for TamilRulePy and even invited me to contribute to TamilString, a Python package for documentation. I accepted his invitation and started working on it.

Then, during one of our conversations, I got an idea: why not develop my own Python package? And that’s how TamilKavi was born.

I shared my idea with Boopalan and invited him to build it as a team because, honestly, I’m no expert. But it wasn’t easy — we had to overcome countless challenges, especially since we were both preparing for our model exams and semester exams (he’s an MSc student, and I’m a BSc student). It was a tough time, but I didn’t give up. I studied, understood, and gradually started coding — not entirely on my own, of course.

Now, you might wonder: why build a website? Simple: to collect data from authors. But due to financial constraints, the plan to collect data through the website itself turned into a Google Form, which the site now links to from a navigation button. That’s another story altogether. Since I had no time, I built a basic structure using Lovable.dev and handed it over to my juniors, Gagan & Rohith, who took care of the website.

The final result? Release of the Python package & website!

I must especially thank Praveen Bro — my community brother and mentor. Without hesitation, he offered me a subdomain. For me, that’s a huge deal, and I’m incredibly grateful!

You might ask, “Okay thambi, enough of this English talk. Why did you release the dataset?”

Well, there’s a reason for that, too. I saw Selvakumar Duraipandian Bro’s LinkedIn posts about the numerous Tamil datasets he has published on Hugging Face, including Thirukkural, Tholkappiyam, and more. I was truly inspired by his work, so I released mine as a dataset as well.

Now, you might ask, “So, thambi, after all this talk, what does your package actually do?”

It’s simple: TamilKavi helps you discover new Tamil poems. That’s all. Now you’re probably thinking:

“Edhuka evalo seenu?” (Roughly: “All this fuss for that?”)

Well, I’m not just a developer. For someone who is both a Tamil poet and a tech enthusiast, it’s a crazy project. Through this journey, I’ve learned so much, especially about GitHub workflows.

If you find this content valuable, follow me for more upcoming blogs.


Learning Notes #71 – pyproject.toml

12 February 2025 at 16:57

In the evolving Python ecosystem, pyproject.toml has emerged as a pivotal configuration file, streamlining project management and enhancing interoperability across tools.

In this blog, I dig into the significance, structure, and usage of pyproject.toml.

What is pyproject.toml?

Introduced in PEP 518, pyproject.toml is a standardized file format designed to specify build system requirements and manage project configurations. Its primary goal is to provide a unified, tool-agnostic approach to project setup, reducing the clutter of multiple configuration files.

Why Use pyproject.toml?

  • Standardization: Offers a consistent way to define project metadata, dependencies, and build tools.
  • Interoperability: Supported by various tools like Poetry, Flit, Black, isort, and even pip.
  • Simplification: Can consolidate configuration that would otherwise be spread across several files (like setup.cfg, requirements.txt) into one.
  • Future-Proofing: As Python evolves, pyproject.toml is becoming the de facto standard for project configurations, ensuring compatibility with future tools and practices.

Structure of pyproject.toml

The pyproject.toml file uses the TOML format, which stands for “Tom’s Obvious, Minimal Language.” TOML is designed to be easy to read and write while being simple enough for parsing by tools.

1. [build-system]

Defines the build system requirements. Essential for tools like pip to know how to build the project.

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

requires: Lists the build dependencies required to build the project. These packages are installed in an isolated environment before the build process starts.

build-backend: Specifies the backend responsible for building the project. Common backends include:

  • setuptools.build_meta (for traditional Python projects)
  • flit_core.buildapi (for projects managed with Flit)
  • poetry.core.masonry.api (for Poetry projects)

2. [tool]

This section is used by third-party tools to store their configuration. Each tool manages its own sub-table under [tool].

Example with Black (Python code formatter):

[tool.black]
line-length = 88
target-version = ["py38"]
include = '\.pyi?$'
exclude = '''
/(
  \.git
  | \.mypy_cache
  | \.venv
  | build
  | dist
)/
'''

  • line-length: Sets the maximum line length for code formatting.
  • target-version: Specifies the Python versions that Black’s formatted output should support.
  • include / exclude: Regular expressions to define which files Black should format.
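
The include and exclude patterns are easier to reason about when you try them on a few paths. The sketch below is a simplified approximation, not Black's actual file discovery (which also considers things like .gitignore); the would_format helper and the sample paths are assumptions for illustration. Multi-line patterns like the exclude above are treated as verbose regexes, so the whitespace inside them is ignored.

```python
import re

# Patterns copied from the [tool.black] example above.
INCLUDE = re.compile(r"\.pyi?$")
EXCLUDE = re.compile(
    r"""
/(
  \.git
  | \.mypy_cache
  | \.venv
  | build
  | dist
)/
""",
    re.VERBOSE,  # whitespace and newlines inside the pattern are ignored
)


def would_format(path: str) -> bool:
    """Rough check: the path matches include and does not match exclude."""
    return bool(INCLUDE.search(path)) and not EXCLUDE.search(path)


for path in ["/src/app.py", "/src/types.pyi", "/build/lib/app.py", "/README.md"]:
    print(path, would_format(path))
```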

Example with isort (import sorter)

[tool.isort]
profile = "black"
line_length = 88
multi_line_output = 3
include_trailing_comma = true

  • profile: Allows easy integration with formatting tools like Black.
  • multi_line_output: Controls how imports are wrapped.
  • include_trailing_comma: Ensures trailing commas in multi-line imports.

3. [project]

Introduced in PEP 621, this section standardizes project metadata, reducing reliance on setup.py.

[project]
name = "my-awesome-project"
version = "0.1.0"
description = "An awesome Python project"
readme = "README.md"
requires-python = ">=3.8"
authors = [
    { name="Syed Jafer K", email="syed@example.com" }
]
dependencies = [
    "requests>=2.25.1",
    "fastapi"
]
license = { file = "LICENSE" }
keywords = ["python", "awesome", "project"]
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent"
]

  • name, version, description: Basic project metadata.
  • readme: Path to the README file.
  • requires-python: Specifies compatible Python versions.
  • authors: List of project authors.
  • dependencies: Project dependencies.
  • license: Specifies the project’s license.
  • keywords: Helps with project discovery in package repositories.
  • classifiers: Provides metadata for tools like PyPI to categorize the project.

4. Optional scripts and entry-points

Define CLI commands:

[project.scripts]
mycli = "my_module:main"

  • scripts: Maps command-line scripts to Python functions, allowing users to run mycli directly after installing the package.
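
The "my_module:main" value reads as "import my_module, then look up main". The sketch below shows roughly how such a reference can be resolved; resolve_entry_point is a hypothetical helper, and since my_module does not exist here, the standard library's json:dumps stands in for it. The installed mycli wrapper does something similar and then passes the function's return value to sys.exit.

```python
import importlib


def resolve_entry_point(spec: str):
    """Resolve a 'module:attribute' reference to the object it names.

    Hypothetical helper for illustration; installers generate their own wrapper.
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    if attr_path:
        # Dotted attribute paths such as "Class.method" are walked step by step.
        for attr in attr_path.split("."):
            obj = getattr(obj, attr)
    return obj


# Stand-in for "my_module:main": any importable callable works the same way.
dumps = resolve_entry_point("json:dumps")
print(dumps({"cli": "mycli"}))
```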

Tools That Support pyproject.toml

  • Build tools: Poetry, Flit, setuptools
  • Linters/Formatters: Black, isort, Ruff
  • Test frameworks: Pytest (via the [tool.pytest.ini_options] table, e.g. addopts)
  • Package managers: Pip (PEP 517/518 compliant)
  • Documentation tools: Sphinx

Migration Tips

  • Gradual Migration: Move one configuration at a time to avoid breaking changes.
  • Backwards Compatibility: Keep older config files during transition if needed.
  • Testing: Use CI pipelines to ensure the new configuration doesn’t break the build.

Troubleshooting Common Issues

  1. Build Failures with Pip: Ensure build-system.requires includes all necessary build tools.
  2. Incompatible Tools: Check for the latest versions of tools to ensure pyproject.toml support.
  3. Configuration Errors: Validate your TOML file with online validators like TOML Lint.
