Welcoming Winter with Python – PyKids and PyLadies

31 January 2025 at 21:30

Winter Break for kids

In Canada, all school kids get a winter break of around 15 days, covering Christmas and New Year.

These celebrations help a lot in getting past the winter worries.

Winter is a scary word, but people have to go through it, as life has to go on. As we cannot travel much and there are no outdoor events or games, we have to stay at home for days, weeks and months. Organizing indoor events is costly.

To keep the winter active, there are many celebration days: Halloween, Christmas, Boxing Day, New Year, Valentine's Day and more.

Keeping the kids at home for a 17-day winter break is tough. We have to keep them engaged the whole day. In our apartment, we conduct many kids' events, like a weekly chess hour, dance hours, board game days, movie time, sleepover nights, etc.

Computer literacy is good here. Kids learn to use computers at school from Grade 3 itself. They play many educational games at school. Homework is done with Google Slides and Google Docs from Grade 5. Scratch programming is also taught at Grade 5. So, they know very well how to use a computer, read text online, search the internet, gather information, etc.

PyKids

This time, I thought of having some tech events for kids. I announced a 10-day training called “PyKids”, for Grade 5 and above. The announcement was welcomed by many parents. Around 17 kids participated.

As our house is mostly empty (thanks to Nithya, for the minimalistic lifestyle), our hall served well for gathering and teaching.

By keeping the hall empty, we use the place as a daily Zumba dance hall, mini party hall, DJ hall, kids' play area and now as a learning place.

Teaching Python to kids is not easy. The kids are not ready to listen to any long talks. They cannot even sit through my regular “Python introduction” slides. So, I jumped into hands-on work on day one itself.

A few months ago, my mentor, Asokan Pichai, explained why we have to go hands-on in any Python training. I experienced the benefits of it this time.

Even though I have been using Python for 10+ years, teaching it to kids was really tough. I had to read a few books and brush up on the basics, so that I could explain the building blocks of Python with more relevant examples for kids.

The kids are good at asking questions. They share feedback with their eyes alone. It is hugely different from teaching adults. Most adults don’t ask questions. They hesitate to say they don’t understand something. But kids are brave enough to ask questions and express their feedback immediately.

With training from 4 to 6 pm every day, for around 10 days, we could cover only a little of Python.

We practiced the code here – https://github.com/tshrinivasan/python-for-kids

We used https://www.online-python.com/ as the IDE, as the kids have laptops and tablets with different OSes. I will install Python on their laptops at the next events, so that they can explore more Python libraries.

On the final day, my friend Jay Varadharajan gave a pizza party for all the kids, along with participation certificates.

Thanks for all the questions, kids. Along with you, I learnt a lot. Thanks to all the parents for the great support.

PyLadies

Nithya wanted to try out a full-day training for her friends. Getting 9 to 5 free to learn something is a luxury for many people. Still, around 10 friends participated.

Nithya ran the day fully hands-on. She covered variables, getting input, if/else, for/while loops, and string/list operations. The participants were happy to dive into programming so quickly.
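To give a feel for those topics, here is a small sketch that touches each of them once. It is not the material used in the session. The function names (describe_age, count_vowels, countdown, main) and the sample values are made up for illustration.

```python
# Illustrative only: not the material used in the session.

def describe_age(age):
    """if/else: pick a label based on a number."""
    if age < 13:
        return "kid"
    elif age < 20:
        return "teen"
    else:
        return "adult"


def count_vowels(text):
    """for loop + string operations: count the vowels in a string."""
    count = 0
    for letter in text.lower():
        if letter in "aeiou":
            count += 1
    return count


def countdown(start):
    """while loop + list operations: collect the numbers from start down to 1."""
    numbers = []
    while start > 0:
        numbers.append(start)
        start -= 1
    return numbers


def main():
    """Variables and getting input: ask two questions and use the answers."""
    name = input("What is your name? ")
    age = int(input("How old are you? "))
    print("Hello,", name.title(), "- you are a", describe_age(age))
    print("Your name has", count_vowels(name), "vowels")
    print("Countdown:", countdown(3))
```

Calling main() in a Python shell runs the interactive part. The three helper functions can also be tried one by one, which suits a hands-on session.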

“A Byte of Python” is a super easy book for learning Python. Read it here for free: https://python.swaroopch.com/

Gave them this link and asked them to read and practice regularly. Hope they are following the book.

Hall for PyLadies meetup

Home as Learning Space

Thus, we are converting our home into a learning space for kids and friends. We are thinking of conducting some technical meetups too. (I am missing all the Linux Users Group meetings and hackathons.) Hope we can have more tech events in the winter and make it interesting and productive.

Kanchi Linux Users Group Monthly Meeting – Dec 08, 2024

8 December 2024 at 02:43

Hi everyone,
KanchiLUG’s monthly meet is scheduled as an online meeting this week, on Sunday, Dec 08, 2024, 17:00 – 18:00 IST.

Meeting link : https://meet.jit.si/KanchiLugMonthlyMeet

You can join with any browser or the Jitsi Android app.
All the discussions are in Tamil.

Talk Details

Talk 0:
Topic : My Elisp ‘load random theme’ function
Description : I wanted to load a random theme in Emacs during startup. After searching online, I achieved this functionality using Emacs Lisp. This is what my talk is about.
Duration : 10 minutes
Name : Krishna Subramaniyan
About : GNU/Linux and Emacs user 😉

Talk 1:
Topic : PDF generation using Python
Description : A demo of a Python program that generates PDF output.
Duration : 20 minutes
Name : Sethu
About : Member of KanchiLUG & Kaniyam IRC Channel

Talk 2:
Topic : distrobox – a wrapper on podman/docker
Description : An intro to the tool, why I had to use it, and a demo
Duration : 15 minutes
Name : Annamalai N
About : a GNU/Linux user

Talk 3:
Topic : Real Time Update Mechanisms (Polling, Long Polling, Server Sent Events)
Description : A demo of real-time update mechanisms with JS and Python
Duration : 30 minutes
Name : Syed Jafer (parottasalna)
About : Developer. Currently teaching Postgres at https://t.me/parottasalna

After Talks : Q&A, General discussion

About KanchiLUG : Kanchi Linux Users Group [ KanchiLUG ] has been spreading awareness of Free/Open Source Software (F/OSS) in Kanchipuram since November 2006.

Anyone can join! (Entry is free)
Everyone is welcome
Feel free to share this with your friends

Mailing list: kanchilug@freelists.org
Repository : https://gitlab.com/kanchilug
Twitter handle: @kanchilug
Kanchilug Blog : http://kanchilug.wordpress.com

To subscribe to or unsubscribe from the KanchiLUG mailing list :
http://kanchilug.wordpress.com/join-mailing-list/

Weekly Notes 48 – 2024

5 December 2024 at 03:18

Christmas Lighting at Niagara

A few weeks ago, the Christmas lighting started at Niagara. Went there with friends in the evening. Niagara is one of the greatest natural beauties, which we can see a million times. Visited the casino there. Got a free $10 card and played a few slot machines. Won $3 and lost all $13. Though it was free money, it was tough to stop the game.

Then, visited the wonderful lighting. It was a very long walk on dark roads, with glittering lights along the roadsides. The kids enjoyed seeing them all.

A few photos are here – https://shrini-clicks.kaniyam.cloudns.nz/#/collections/albums/2024-niagara-chrismas-lighting

Winter Celebration at Heart Comonos

Last Saturday, we had a grand event to celebrate the winter, organized by the HeartComonos team. I volunteered a little for the event. We had nearly 2 months of preparation. All the volunteers made the event a memorable one. We had around 300 participants. The Bollywood dance team won all the attention. I had a makeover as an elf and gave candy to all the kids there.

A few pics of the event are here – https://shrini-clicks.kaniyam.cloudns.nz/#/collections/albums/2024-welcoming-winter-2024

Daily IRC meetings for open source project mentoring

We are having daily meetings for open source project mentoring. Around 10 people are doing different projects. We discuss many things like Linux, Emacs, productivity, book reviews, etc. Read the logs here – https://ircbot.comm-central.org:8080/kaniyam

2025 planning for kaniyam

Started a thread to plan the 2025 activities for Kaniyam Foundation. Write there what you think we can do next year.

https://forums.tamillinuxcommunity.org/t/topic/2723/2

Revamping FreeTamilEbooks.com

FreeTamilEbooks.com has had a very old theme for the past 10+ years. We need the below changes.

  • Check for a new theme
  • Improve its SEO for search results
  • Fix the send2kindle links
  • Fix the categories
  • Merge duplicate author names and categories
  • Fix the download stats
  • Get the detailed download report for all the books
  • Have an author page
  • Have a contributor page
  • Remove email addresses from the books’ pages
  • Add intro content to all books

Created a project idea issue for this here: https://github.com/KaniyamFoundation/ProjectIdeas/issues/237

Ravishankar is the founding member of FreeTamilEbooks.com and a mentor for Kaniyam. He has started to work on these tasks. He gave it a new theme and improved the SEO. Need more volunteers to work on the other items. Let me know if you can spend a few hours on FreeTamilEbooks.com.

Winter / Snow started

Today, we got the very first snowfall of the year. It is mesmerizing to see all the green lands turning white. For 4 months, we will be in a hibernating state. Have to plan many indoor events. I have tons of books to read and tasks to complete.

LLM dataset part 3 released

We are collecting a large amount of Tamil text with shareable, open licensed content, for LLM and other research work. So far, collected 1.6 GB of text, from Tamil Wikipedia, FreeTamilEbooks, Project Madurai, Thamizh_mann publishers’ books, etc. Get the data from here – https://kaniyam.cloudns.nz/tamil_datasets/

Read the blog posts on these here:

part 1 – https://goinggnu.wordpress.com/2024/06/11/collecting-content-for-llm-dataset-part-1-tamil-wikipedia-content/

part 2 – https://goinggnu.wordpress.com/2024/06/16/collecting-content-for-llm-dataset-part-2-freetamilebooks/

part 3 – https://goinggnu.wordpress.com/2024/11/23/collecting-content-for-llm-dataset-part-3-thamizh_mann-books-project-madurai-wikisource/

Planning for scratch / python training for kids

Around Dec 25 to Jan 5, we will get the winter break for schools. I am thinking of teaching Python or Scratch to kids during this break. Learning Scratch for that. The graphics, the colors and drag/drop may be easy for kids. But it is tough for me as a terminal dweller. Yet to think and plan more on the training for kids. At the least, I should teach the basics of programming and give a taste of making computers obey our orders.

Weekly chess hours for kids

Started to teach chess to our condo kids. We conduct a weekly chess hour, to teach and play with other kids. Good to see that many kids know chess already, and they all enjoy the game hours.

Moana 2

Watched Moana 2 yesterday. Viyan loved it. It is very difficult to make a part 2 movie as good as part 1. The Moana team did a great job on this. Stunning graphics, a good storyline, nice music and heart-melting songs make the movie a wonder. Don’t miss watching it in theaters.

Books

Completed – பேசத் தெரிந்த நிழல்கள் – S. Ramakrishnan. Took it from a local library. It is a book full of reviews of world movies. Happy to read a physical Tamil book from 1000s of miles away.

In progress – Drupal, LLM, Digital Museums

100% savings on Thanksgiving Day

As usual, got 100% savings on the Thanksgiving Day sales. Bought nothing. As we follow minimalism as much as possible, we feel that we have all the things we need. I was thinking, if the day were named something like “Genocide Memorial Day”, we wouldn’t be rushing to shops for offers and sales. They should rename the day to show the history.

Self-hosted social media – gotosocial

I am running a social media platform on my desktop, and interacting with the world using that. It is software called ‘GoToSocial’. It is like Twitter, but we can install it on our own servers. The Fediverse’s ActivityPub protocol connects it with many other such servers around the world.

Hosted it here: https://social.kaniyam.cloudns.nz/

Happy to see that many Emacs, FOSS, Linux and self-hosting lovers are there to interact with. Getting replies to all the questions I ask there. It gives much happiness to be with like-minded people around the globe.

Collecting content for LLM dataset – Part 2 – FreeTamilEbooks

16 June 2024 at 02:35

At FreeTamilEbooks.com, we have published 850 ebooks, all under shareable Creative Commons licenses. Many people have asked, many times, for the text-only content of all these books. As it is a big task, it took a long time. Thanks to Lenin and Anwar of Kaniyam Foundation, all the contributors, and all the writers and readers for keeping this project alive and making it a great success.

We publish the books in epub format, along with PDF. An epub is just a zip file of HTML files. So, we can copy all the content from it as Unicode text. Pandoc is a wonderful open source tool that can convert an epub to a plain text file.
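To see the "zip file of HTML files" idea in action, here is a minimal sketch using only the Python standard library. It is not the method used for this dataset (that was pandoc). The names TextCollector and epub_to_text are made up for illustration, and the sketch assumes the pages inside the epub are UTF-8 encoded.

```python
import zipfile
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the visible text of an HTML page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def epub_to_text(epub_file):
    """Return the text of every HTML/XHTML file inside an epub (a zip archive)."""
    chunks = []
    with zipfile.ZipFile(epub_file) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                page = archive.read(name).decode("utf-8", errors="replace")
                collector = TextCollector()
                collector.feed(page)
                chunks.append("\n".join(collector.parts))
    return "\n\n".join(chunks)
```

This sketch reads the pages in file-name order, which may not match the reading order declared inside the epub. Pandoc follows the book's own reading order and handles many more details, so it remains the better tool for the real conversion.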

Here is the list of actions we have to do.

  1. Get the URLs of all the 850+ epub files.
  2. Download them all.
  3. Using pandoc, convert them to text files.

So far, we don’t have a metadata file for all the books published. Getting the links to all the epub files needs some programming. As Python is a Swiss Army knife for automating anything, I started to explore the WordPress REST API with Python to get the content of all the book pages.

Wrote the code to get all the books’ info here: https://github.com/KaniyamFoundation/create_ebooks/blob/master/get_metadata/get_Data.py

This gave a JSON file with the book name, author, genre, and the epub, mobi, A4 PDF and 6-inch PDF links.
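The linked script is the real implementation. As a rough illustration of how paging through the WordPress REST API can look, here is a minimal sketch. It assumes the books are reachable through the default /wp-json/wp/v2/posts route, which may not be how FreeTamilEbooks.com stores them, and the function names fetch_page and fetch_all_posts are made up.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, page, per_page=100):
    """Fetch one page of posts; return (list_of_posts, total_pages)."""
    query = urllib.parse.urlencode({"per_page": per_page, "page": page})
    url = f"{base_url.rstrip('/')}/wp-json/wp/v2/posts?{query}"
    with urllib.request.urlopen(url) as response:
        # WordPress reports the number of pages in this response header.
        total_pages = int(response.headers.get("X-WP-TotalPages", "1"))
        return json.load(response), total_pages


def fetch_all_posts(base_url, fetch=fetch_page):
    """Walk through every page and collect title, link and rendered HTML."""
    books = []
    page, total_pages = 1, 1
    while page <= total_pages:
        posts, total_pages = fetch(base_url, page)
        for post in posts:
            books.append({
                "title": post["title"]["rendered"],
                "link": post["link"],
                "content": post["content"]["rendered"],
            })
        page += 1
    return books
```

The fetch parameter lets the loop be tried with a fake fetcher, without touching the network. The download links would then have to be picked out of each post's rendered HTML, which this sketch does not attempt.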

Converted this to a CSV file with the below code: https://github.com/KaniyamFoundation/create_ebooks/blob/master/get_metadata/parse.py

I had to fix a few things manually in the CSV file.

This is the final CSV file: https://github.com/KaniyamFoundation/create_ebooks/blob/master/get_metadata/fte_metadata.csv
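For readers who have not done a JSON to CSV conversion before, here is a minimal sketch of the idea, separate from the linked parse.py. The column names in FIELDS are hypothetical; the real JSON keys and CSV headers may differ.

```python
import csv
import json

# Hypothetical column names; the real JSON keys may differ.
FIELDS = ["name", "author", "genre", "epub", "mobi", "a4_pdf", "six_inch_pdf"]


def json_to_csv(json_text, out_file):
    """Write a JSON list of book records to out_file as CSV rows."""
    books = json.loads(json_text)
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    for book in books:
        # Missing keys become empty cells; unknown keys are dropped.
        writer.writerow({field: book.get(field, "") for field in FIELDS})
    return len(books)
```

When writing to a real file, open it with newline="" and encoding="utf-8", as the csv module documentation recommends, so that Tamil titles and line endings come out right.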

The below code downloads all the epub files from their links in the fte_metadata.csv file, and uses pandoc to convert them to text.

https://github.com/KaniyamFoundation/create_ebooks/blob/master/get_metadata/get_fte_books.py
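The linked script is what was actually run. The download-and-convert loop can be sketched roughly as below, assuming pandoc is installed and on the PATH. The column name "epub" and the function names pandoc_command and convert_books are made up for illustration.

```python
import csv
import subprocess
import urllib.request
from pathlib import Path


def pandoc_command(epub_path, txt_path):
    """Build the pandoc command that converts one epub to plain text."""
    return ["pandoc", "--from", "epub", "--to", "plain",
            "--output", str(txt_path), str(epub_path)]


def convert_books(csv_path, out_dir, url_column="epub"):
    """Download every epub listed in the CSV and convert it with pandoc."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            url = (row.get(url_column) or "").strip()
            if not url:
                continue  # no epub link for this book
            epub_path = out_dir / Path(url).name
            urllib.request.urlretrieve(url, epub_path)
            txt_path = epub_path.with_suffix(".txt")
            # check=True raises CalledProcessError if pandoc fails
            subprocess.run(pandoc_command(epub_path, txt_path), check=True)
```

A call such as convert_books("fte_metadata.csv", "txt") would fill the txt folder with one epub and one text file per book. For hundreds of books, it is worth adding a pause between downloads and skipping files that already exist.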

Got 845 txt files. The total size is 374 MB.

Compressed with 7z to get a 47 MB compressed file.

Published the data here: https://kaniyam.cloudns.nz/tamil_datasets/

Download and share the text data for free. Don’t sell it, as most of the books are released under the CC-BY-NC (Non-Commercial) license.

Use this data to build awesome open source applications and research, like spell checkers, grammar checkers, LLMs, RAG, and what not.

Data is always the oil. Let us grow the open data oil.

Please share all your text, audio and video content under a shareable license like Creative Commons. It will be used to build a better future.

Collecting content for LLM dataset – Part 3 – Thamizh_Mann books, project madurai, WikiSource

23 November 2024 at 00:34

We are collecting open licensed datasets in the Tamil language, to build LLMs and other interesting applications in the coming days.

The ML models we build may have a very short lifespan, but the open data will be there forever, or at least for longer than our lifetime.

Check out part 1 and part 2 of the effort here.

part 1 – https://goinggnu.wordpress.com/2024/06/11/collecting-content-for-llm-dataset-part-1-tamil-wikipedia-content/

part 2 – https://goinggnu.wordpress.com/2024/06/16/collecting-content-for-llm-dataset-part-2-freetamilebooks/

Here goes part 3.

Thamizh_mann publishers have been publishing public domain and nationalized Tamil books for many years. A few years ago, in a collaboration between the Library at University of Toronto Scarborough, Canada, and Thamizh_mann publishers, the Kaniyam Foundation team helped to release all the 1000+ Tamil books in PDF and Docx formats, free online.

You can download them all here: https://tamil.digital.utsc.utoronto.ca/61220/utsc35335

Thanks to UTSC and the Thamizh_mann team for this great gift to the Tamil diaspora.

Now, we have 1000+ books in Unicode Docx format. The next step is to convert them all to plain text and use them. Natkeeran and Parathan helped with this.

Along with this, they helped to scrape Project Madurai books and Tamil Wikisource books. They published everything in a git repo here – https://github.com/KaniyamFoundation/open_tamil_texts along with the scripts and metadata.

I am adding those texts to our open licensed Tamil data collection.

Download them all here: https://kaniyam.cloudns.nz/tamil_datasets/

Here is the current size in text format and compressed format.

shrini@dell-optiplex-9100 v/w/h/tamil_datasets> du -h compressed
258M compressed/

shrini@dell-optiplex-9100 v/w/h/tamil_datasets> du -h text-files
355M text-files/project_madurai/data/text
355M text-files/project_madurai/data
355M text-files/project_madurai
110M text-files/tamil_wikisource/data
110M text-files/tamil_wikisource
374M text-files/FreeTamilEbooks-txt
714M text-files/thamizh_mann/data
716M text-files/thamizh_mann
1.6G text-files/

We have 1.6 GB of text data to work with, for LLMs or other work.
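If you download the data and want to check the sizes without du, a minimal Python sketch like the one below can total a folder. The function names directory_size and human are made up. The numbers will differ slightly from du, which reports disk blocks used and rounds up, while this adds up the file sizes.

```python
import os


def directory_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def human(size_bytes):
    """Format a byte count with binary units (B, K, M, G)."""
    size = float(size_bytes)
    for unit in ("B", "K", "M", "G"):
        if size < 1024 or unit == "G":
            return f"{size:.1f}{unit}"
        size /= 1024
```

For example, print(human(directory_size("text-files"))) gives one total for the whole text collection.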

Go ahead, use it and build more models and tools using this data.

This may not be enough to get any good output. But if we can bring something out of this, even if it is not good, then we can ask people to release their recent content, blogs and social media posts under a Creative Commons license.

A few bloggers and magazines have already released their content under CC licenses. Now, we need your help to scrape them. If you know any programming language and can help with this project, please do web scraping for the websites mentioned here, and share the data and code.

https://github.com/KaniyamFoundation/ProjectIdeas/issues/198

Thanks to all the content providers and contributors.

Collecting content for LLM dataset – Part 3 – Thamizh_Mann books, project madurai, WikiSource

23 November 2024 at 00:34

We are collecting open licensed dataset in tamil language, to build LLM, and other interesting applications in the coming days.

The ML models we build may have very short lifespan, but the open data will be there forever or at least for longer time than our life time.

Check the efforts part 1 and part 2 here.

part 1 – https://goinggnu.wordpress.com/2024/06/11/collecting-content-for-llm-dataset-part-1-tamil-wikipedia-content/

part 2 – https://goinggnu.wordpress.com/2024/06/16/collecting-content-for-llm-dataset-part-2-freetamilebooks/

here goes part 3.

Thamizh_mann publishers are publishing the public domain and nationalized tamil books for many years. Few years ago, with a collaboration with the Library at University of Toronto, Scarborough, Canada, and Thamizh_mann publishers, the kaniyam foundation team helped to release all the 1000+ tamil books as PDF and Docx formats for free online.

You can download them all here https://tamil.digital.utsc.utoronto.ca/61220/utsc35335

Thanks to UTSC, Thamizh_mann team for the great gift for the tamil Diaspora.

Now, we have 1000+ books in Unicode Docx format. Next is to convert them all as PlainText and use them. Natkeeran and Parathan helped on this.

Along with this, they helped to scrap project madurai books and tamil WikiSource books. They published all in a git repo here – https://github.com/KaniyamFoundation/open_tamil_texts along with the scripts and metadata.

I am adding those text in our open licensed tamil data collection.

Download them all here https://kaniyam.cloudns.nz/tamil_datasets/

here is the current size in text format and compressed format.

shrini@dell-optiplex-9100 v/w/h/tamil_datasets> du -h compressed
258M compressed/

shrini@dell-optiplex-9100 v/w/h/tamil_datasets> du -h text-files
355M text-files/project_madurai/data/text
355M text-files/project_madurai/data
355M text-files/project_madurai
110M text-files/tamil_wikisource/data
110M text-files/tamil_wikisource
374M text-files/FreeTamilEbooks-txt
714M text-files/thamizh_mann/data
716M text-files/thamizh_mann
1.6G text-files/

We have 1.6 G of text data to work on LLM or other works.

Go ahead, use it and build more models and tools using this data.

This may not be enough to get any good output. But if we can bring something out of it, even if the results are not good, then we can ask people to release their recent content, blogs and social media posts under a Creative Commons license.

A few bloggers and magazines have already released their content under a CC license. Now, we need your help to scrape them. If you know any programming language and can help with this project, please scrape the websites mentioned here, and share the data and code.

https://github.com/KaniyamFoundation/ProjectIdeas/issues/198
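
For anyone wondering where to start, here is a minimal sketch of the text-extraction half of a scraper, using only Python's standard library `html.parser`. The names `TextExtractor` and `html_to_text` are made up for this example, and every site will need its own tuning (menus, comments and so on are not filtered out here).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML page, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

The page itself can be fetched with `urllib.request.urlopen(url).read().decode("utf-8")`; that part is left out so the sketch runs offline.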

Thanks to all the content providers and the contributors.

Introduction to PostgreSQL database – free online course in Tamil

18 November 2024 at 02:26

Introduction to PostgreSQL database – free online course in Tamil

Monday, Wednesday, Friday, IST evenings.

First class – 18-Nov-2024 7-8 PM IST

Syllabus: https://parottasalna.com/postgres-database-syllabus/

Trainer – Syed Jafer – contact.syedjafer@gmail.com

Get the meeting link here

Telegram Group – https://t.me/parottasalna
WhatsApp channel – https://whatsapp.com/channel/0029Vavu8mF2v1IpaPd9np0s
Kaniyam Tech events Calendar – https://kaniyam.com/events/

kanchilug – Monthly Meeting – Nov 10, 2024

10 November 2024 at 02:40

Hi everyone,
KanchiLUG’s monthly meet is scheduled as an online meeting this week on Sunday, Nov 10, 2024, 17:00 – 18:00 IST

Meeting link : https://meet.jit.si/KanchiLugMonthlyMeet

You can join with any browser or the Jitsi Android app.
All the discussions are in Tamil.

Talk Details

Talk 0:
Topic : Postgres Architecture
Description : In this talk, we will explore the architecture of Postgres
Duration : 30 mins
Name : Sethupandian
About : My name is Sethu and I work as a practice manager for an insurance company in Canada. Back in India, I am from Salem. Completed my engineering in Electrical & Electronics at Kongu Engineering College (2000-2004). Started my IT career in 2005 and worked in companies like Ramco Systems, Verizon, TCS and Cognizant before joining my current employer. I have always been interested in learning fascinating things. Through Payilagam and Muthu sir, I came to know about Kaniyam and KanchiLUG. I am happy to be part of this great initiative. I wish and hope I can contribute whatever is possible from my side.

Talk 1:
Topic : Intro to GDB
Description : Based on my recent Tamil translation of Beej’s guide on the same.
Duration : 20 mins
Name : Annamalai N
About : A GNU/Linux user interested in Embedded Systems. Final year engineering undergrad.

After Talks : Q&A, General discussion

About KanchiLUG : Kanchi Linux Users Group [ KanchiLUG ] has been spreading awareness on Free/Open Source Software (F/OSS) in Kanchipuram since November 2006.

Anyone can join! (Entry is free)
Everyone is welcome
Feel free to share this with your friends

Weekly Notes 44 – 2024

5 November 2024 at 04:20

  • For the past few months, I was preparing for an English exam called CELPIP. It tests Listening, Reading, Writing and Speaking. Though we know English, preparing for an exam is exhausting. We took online training from “Galaxy Training Academy”. https://galaxytraining.in The coach “Jay Kumar” gave a nice intro to the exam pattern. He gave many mock tests and good feedback on how to improve, on each test. Last month, Nithya and I cleared the exam. It feels good to be released from exam fear. Postponed many activities because of the exam preparation. Will roll them all out soon. If you are preparing for any English exam, I suggest taking training and mock tests with “Galaxy Training Academy”.

    ——

  • In Canada, daylight saving ended yesterday. This happens every year in the fall season and is referred to as “Fall back”. The clocks are moved one hour back. This is to adjust to the dark winter season. It seems like, all of a sudden, we got one extra hour to sleep in the morning. 🙂

    ——

  • On Oct 31, we had Deepavali, Halloween and our marriage day. Deepavali day went by with fond memories of our childhood. The evening was filled with fun, as we went to neighbourhood houses, with friends and kids, to play “Trick or Treat”. Saw many weird, spooky decorated houses and people. Kids collected a bag full of chocolates. Last year, it was too cold. This year, the same day had nice weather to roam around in the evening.

    ——

  • On Nov 1, we celebrated Deepavali by firing crackers. Bought a few crackers that emit light. Here, we don’t get loud crackers like atom bombs, 1000 piece fireworks shots etc. With the limited crackers available, kids enjoyed firing them, with all their friends together.

    ——

  • On Nov 8, we are planning a mega Deepavali event with around 250 people here. I am contributing to the planning and photography. Nithya and kids are practicing dance with their friends. Hope it will be a fun-filled evening.


    ——

  • On Nov 2, gave a talk at the Tolkappiyam Canada monthly meeting, about our efforts on writing Python code for the Tamil grammar rules in the Tolkappiyam book. It was a good meeting. A few of the participants agreed to collaborate. You can read our progress here – https://github.com/KaniyamFoundation/ProjectIdeas/issues/214


    ——

  • A few weeks ago, gave a talk on the open-tamil Python library at the Kanchi Linux Users Group (KanchiLUG) meet. Video is here – https://www.youtube.com/watch?v=GtIrbvw2V-w


    ——

  • Kids started going to Tamil school every Saturday morning. This week, they received books. Viyan is good at Tamil and English. Iyal has started to read Tamil and English. Paari is trying to learn writing.

    ——

  • We conduct daily meetings in a text-based chat system called IRC (Internet Relay Chat), daily, 7-8 pm IST. Good to see many people joining, discussing open source software and getting mentored to contribute to it. More details here – https://goinggnu.wordpress.com/2024/10/21/open-source-projects-mentoring-via-irc/

    ——

  • Practicing Manual mode in photography for a few weeks. It feels like learning Linux and Emacs. It gives the most flexible options and the results are stunning. It is better to learn it in the early days, so that we can do more magic with lighting.


    ——


    The one thing I follow in photography is – shoot a lot, share a little. I keep and share only 10%. All others are deleted. Though it is hard to select the best photos, sharing only 10% is easy for viewers and brings a Wow from them.


    ——
  • Completed reading books in last week.
  • currently reading these books.

Open Source projects mentoring via IRC

21 October 2024 at 02:46

In the programming world, if you say ‘I prefer watching videos to reading docs’, it means either you are a programmer already or you won’t become one.

Do you feel that you are struggling to be a good programmer, even after watching 100s of hours of videos?

Let me share one secret. It is the fear of reading and writing PlainText. The more you move away from reading and writing, the more programming moves away from you.

Programming is all about dealing with code, error messages, log files and documentation. All in PlainText. Emails, tickets, docs and reports are also part of the stack of IT life.

If you love the terminal and PlainText tools, you are already into reading and writing. The more you read and write, the more clarity you get in thinking, which is the essential part of programming.

To embrace the simplicity and power of PlainText, a few friends started to discuss on IRC. Yes, the same 35+ year old Internet Relay Chat, a chat system that helped build the internet itself via chat.

Thanks to the Indian Linux Users Group, Chennai, KanchiLUG and Kaniyam Foundation friends for joining the chat.

Read my post on why I like IRC here https://goinggnu.wordpress.com/2020/04/14/why-i-like-irc-internet-relay-chat-even-in-2020/

Here is a small video in Tamil by my friend Muthuramalingam of Payilagam – https://www.youtube.com/watch?v=CGurYNb0BM8

From today, 7-8 pm IST on weekday evenings, we can discuss in the #kaniyam channel at irc.libera.chat

I suggest a terminal-based chat client, “weechat”.

But, for a quick connection, use this link to join and discuss. https://web.libera.chat/gamja/#kaniyam

Start Date – 21 Oct 2024 ( Monday to Friday )
Time – 7-8 pm IST
server – irc.libera.chat
channel – #kaniyam
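
Since IRC is a PlainText protocol, the lines a client sends to the server can be sketched in a few lines of Python. This is only an illustration of the message format, not a working client. The helper names `irc_line` and `intro_lines` and the nick `pykid` are made up for this example.

```python
def irc_line(command, *params, trailing=None):
    """Format one IRC protocol line: COMMAND params [:trailing] CRLF."""
    parts = [command, *params]
    if trailing is not None:
        # The last parameter may contain spaces, so it is prefixed with ':'.
        parts.append(":" + trailing)
    return (" ".join(parts) + "\r\n").encode("utf-8")


def intro_lines(nick, channel, message):
    """Lines a client sends to register, join a channel and say hello."""
    return [
        irc_line("NICK", nick),
        irc_line("USER", nick, "0", "*", trailing=nick),
        irc_line("JOIN", channel),
        irc_line("PRIVMSG", channel, trailing=message),
    ]


# Example with a hypothetical nick; prints the raw bytes of each line.
for line in intro_lines("pykid", "#kaniyam", "hello from my terminal"):
    print(line)
```

A real client such as weechat does more: it connects over TLS, waits for the server's welcome reply before joining, and answers the server's PING messages to stay connected.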

Read the chat logs here – https://ircbot.comm-central.org:8080/kaniyam

Join and say something about yourself.

  • feel like a hacker by chatting with people in your Linux terminal
  • get mentored on Hacktoberfest
  • ask any questions on linux/python/programming/devops
  • share your daily progress on learning and programming
  • practice reading and writing PlainText
  • learn slowly and strongly

See you at IRC.

If you are interested in mentoring students for open source projects, please join and start the discussions.

The other interesting channels where people chat are #ilugc #dgplug #emacs #kde #ubuntu. You can join them and participate in the discussions anytime.
