After a long time, I am writing this blog as I had an hectic
semester (Semester 6) with acads, Spider R&D, BMS (Battery
Management Systems) project and participating in
TOP-IMSD'24 (Will write one
about this in future!)
(Ah Placement Exams too :|).
Then I had went for an Internship to a company as a Hardware
Research Intern but I was told to automate some testing instruments
like DSO, Load Analyzer, Logic Analyzer etc.
Finally after all these I entered into my final year and having some
peace i.e time to work on some other exciting projects and improve
myself!. So I'm planning to be a little regular in blogging which
implies that I do will spend some time exploring new.
Notes from KLUG session
Today I attended the Kanchipuram Linux Users Group's (KLUG) Weekly
Meet for a while and I got to know about,
To manage env, config management for whatever software, programs you
develop in whichever language and also solving all problems
previously was with dotenv !
soju
It is a IRC Bouncer that can be used for logging
data from IRC channels. Need to explore on how to use and configure
it.
Also got to know about how to generate PDFs of websites using
headless chromium.
SELECT e.*, d.DepartmentName
FROM Employees e
JOIN Departments d ON e.DepartmentID = d.DepartmentID;
SELECT e.EmpName AS Employee, m.EmpName AS Manager
FROM Employees e
JOIN Employees m ON e.ManagerID = m.EmpID;
INNER JOIN:
Returns only the rows where there is a match between the columns in both tables.
If no match is found, the row is not included in the result.
Itβs the most common type of join.
OUTER JOIN:
Returns all rows from one or both tables, even if there is no match in the other table.
LEFT OUTER JOIN (or just LEFT JOIN): Returns all rows from the left table, and the matched rows from the right table. If no match, the result will have NULL values for columns from the right table.
RIGHT OUTER JOIN (or just RIGHT JOIN): Returns all rows from the right table, and the matched rows from the left table. If no match, the result will have NULL values for columns from the left table.
FULL OUTER JOIN: Returns all rows from both tables. If there is no match, the result will have NULL values for the non-matching tableβs columns.
GROUP BY:
Groups rows that have the same values in specified columns into summary rows (like finding the total count, sum, average, etc.).
It is typically used with aggregate functions such as COUNT(), SUM(), AVG(), MAX(), MIN().
HAVING:
Used to filter records after the GROUP BY has been applied.
It works similarly to the WHERE clause, but WHERE is used for filtering individual rows before grouping, while HAVING filters the grouped results.
SELECT DeptName, COUNT(*)
FROM Employees
GROUP BY DeptName;
DISTINCT:
Used to remove duplicate rows from the result set based on the specified columns.
If you specify only one column, it will return the distinct values of that column.
If you specify multiple columns, the combination of values in those columns will be considered to determine uniqueness.
SELECT DISTINCT DeptName FROM Employees;
SELECT DISTINCT DeptName, EmpName FROM Employees;
Difference between DELETE and TRUNCATE:
Removes rows one by one and logs each deletion, which can be slower for large datasets.
You can use a WHERE clause to specify which rows to delete.
Can be rolled back if youβre working within a transaction (assuming no COMMIT has been done).
Can fire triggers if there are any triggers defined on the table (for example, BEFORE DELETE or AFTER DELETE triggers).
TRUNCATE:
Removes all rows in the table in one go, without scanning them individually.
Does not support a WHERE clause, so it always deletes all rows.
Itβs much faster than DELETE because it doesnβt log individual row deletions (but it does log the deallocation of the tableβs data pages).
Cannot be rolled back in most databases (unless in a transaction, depending on the DBMS), and there are no triggers involved.
UNION:
Combines the results of two or more queries.
Removes duplicates: Only unique rows are included in the final result.
It performs a sort operation to eliminate duplicates, which can have a slight performance cost.
UNION ALL:
Also combines the results of two or more queries.
Keeps duplicates: All rows from the queries are included in the final result, even if they are the same.
It doesnβt perform the sort operation, which usually makes it faster than UNION.
SELECT EmpID, EmpName FROM Employees
UNION ALL
SELECT EmpID, EmpName FROM Contractors;
SELECT EmpID, EmpName FROM Employees
UNION
SELECT EmpID, EmpName FROM Contractors;
COALESCE():
First Non null value will be taken, For ex. in select statement, some names are null, that time some default value can be used or another field value. SELECT COALESCE(NULL, βHelloβ, βWorldβ); Output: Hello
SELECT id, COALESCE(name, nickname, βUnknownβ) AS display_name FROM users;
NULLIF()
NULLIF(expression1, expression2) Returns null if both expressions or column values are equal, else return first the first column value, ie expression1 SELECT NULLIF(10, 10); β Output: NULL SELECT NULLIF(10, 20); β Output: 10 SELECT NULLIF(10, NULL) OR β Output: 10 SELECT NULLIF(NULL, 10) β Output: NULL
IF Condition:
The IF statement is used to check conditions and execute SQL code accordingly.
IF condition THEN
-- Code to execute if the condition is true
ELSIF condition THEN
-- Code block to execute if another condition is true
ELSE
-- Code to execute if the condition is false
END IF;
IF NOT FOUND THEN
RAISE NOTICE 'Employee with ID % not found!', emp_id;
emp_bonus := 0;
END IF;
CASE WHEN:
The CASE WHEN expression is used for conditional logic within a query (similar to IF but more flexible in SQL).
SELECT
name,
salary,
CASE
WHEN salary > 5000 THEN 'High Salary'
WHEN salary BETWEEN 3000 AND 5000 THEN 'Average Salary'
ELSE 'Low Salary'
END AS salary_category
FROM employees;
FOR LOOP:
DECLARE
i INT;
BEGIN
FOR i IN 1..5 LOOP
-- Perform an action for each iteration (e.g., insert or update a record)
INSERT INTO audit_log (action, timestamp)
VALUES ('Employee update', NOW());
END LOOP;
END;
FOR record IN SELECT column1, column2 FROM employees LOOP
-- Code block using record.column1, record.column2
END LOOP;
RAISE β used for printing something (SOP in java)
RAISE NOTICE βEmployee: %, Salary: %β, emp_name, emp_salary; RAISE EXCEPTION βAn error occurred: %β, error_message; β This will print and halt the execution. RAISE INFO βEmployee: %, Salary: %β, emp_name, emp_salary;
Stored Procedures in SQL:
A stored procedure is a reusable block of SQL code that performs specific tasks. It is stored in the database and can be called as needed. Stored procedures are used for:
Modularizing complex SQL logic.
Improving performance by reducing network traffic.
Ensuring code reuse and security (by granting permissions to execute rather than to the tables directly).
Example:
A stored procedure to insert a new employee record:
CREATE PROCEDURE add_employee(emp_name VARCHAR, emp_salary NUMERIC)
LANGUAGE plpgsql AS
$$
BEGIN
INSERT INTO employees (name, salary) VALUES (emp_name, emp_salary);
END;
$$;
Execution:
CALL add_employee(βJohn Doeβ, 50000);
Functions in SQL:
A SQL function is a reusable block of SQL code that performs specific tasks. It is stored in the database and can be called as needed. It is similar to a procedure but returns a single value or table. Functions are typically used for computations or transformations. Example: A function to calculate the yearly salary:
CREATE FUNCTION calculate_yearly_salary(monthly_salary NUMERIC)
RETURNS NUMERIC
LANGUAGE plpgsql AS
$$
BEGIN
RETURN monthly_salary * 12;
END;
$$;
Execution:
SELECT calculate_yearly_salary(5000); OR EXECUTE calculate_yearly_salary(5000); (If we are using inside a trigger)
Key Differences Between Procedures and Functions:
Return Type:
Function: Always returns a value.
Procedure: Does not return a value.
Usage:
Function: Can be used in SQL queries (e.g., SELECT).
Procedure: Called using CALL, cannot be used in SQL queries.
Transaction Control:
Function: Cannot manage transactions.
Procedure: Can manage transactions (e.g., COMMIT, ROLLBACK).
Side Effects:
Function: Should not have side effects (e.g., modifying data).
Procedure: Can modify data and have side effects.
Calling Mechanism:
Procedure: Called using CALL procedure_name().
Function: Called within SQL expressions, like SELECT function_name().
TRIGGER:
A trigger is a special kind of stored procedure that automatically executes (or βfiresβ) when certain events occur in the database, such as INSERT, UPDATE, or DELETE. Triggers can be used to enforce business rules, validate data, or maintain audit logs. Key Points:
Types of Triggers:
BEFORE Trigger: Fires before the actual operation (INSERT, UPDATE, DELETE).
AFTER Trigger: Fires after the actual operation.
INSTEAD OF Trigger: Used to override the standard operation, useful in views. (This is in SQL Server only not in postgres)
Trigger Actions: The trigger action can be an operation like logging data, updating related tables, or enforcing data integrity.
Trigger Events: A trigger can be set to fire on certain events, such as when a row is inserted, updated, or deleted.
Trigger Scope: Triggers can be defined to act on either a row (executing once for each affected row) or a statement (executing once for the entire statement).
A trigger can be created to log changes in a Users table whenever a record is updated, or it could prevent deleting a record if certain conditions arenβt met.
Example:
CREATE TRIGGER LogEmployeeAgeUpdate
AFTER UPDATE ON Employees
FOR EACH ROW
BEGIN
IF OLD.Age <> NEW.Age THEN
INSERT INTO EmployeeLogs (EmployeeID, OldAge, NewAge)
VALUES (OLD.EmployeeID, OLD.Age, NEW.Age);
END IF;
END;
Example:
CREATE OR REPLACE FUNCTION prevent_employee_delete()
RETURNS TRIGGER AS
$$
BEGIN
-- Check if the employee is in a protected department (for example, department_id = 10)
IF OLD.department_id = 10 THEN
RAISE EXCEPTION 'Cannot delete employee in department 10';
END IF;
RETURN OLD;
END;
$$
LANGUAGE plpgsql;
-- Attach the function to a trigger
CREATE TRIGGER prevent_employee_delete_trigger
BEFORE DELETE ON Employees
FOR EACH ROW
EXECUTE FUNCTION prevent_employee_delete();
Creates a trigger which is used to log age and related whenever insert, delete, update action on employee rows:
CREATE OR REPLACE FUNCTION log_employee_changes()
RETURNS TRIGGER AS
$$
BEGIN
-- Handle INSERT operation
IF (TG_OP = 'INSERT') THEN
INSERT INTO EmployeeChangeLog (EmployeeID, OperationType, NewAge, ChangeTime)
VALUES (NEW.EmployeeID, 'INSERT', NEW.Age, CURRENT_TIMESTAMP);
RETURN NEW;
-- Handle UPDATE operation
ELSIF (TG_OP = 'UPDATE') THEN
INSERT INTO EmployeeChangeLog (EmployeeID, OperationType, OldAge, NewAge, ChangeTime)
VALUES (OLD.EmployeeID, 'UPDATE', OLD.Age, NEW.Age, CURRENT_TIMESTAMP);
RETURN NEW;
-- Handle DELETE operation
ELSIF (TG_OP = 'DELETE') THEN
INSERT INTO EmployeeChangeLog (EmployeeID, OperationType, OldAge, ChangeTime)
VALUES (OLD.EmployeeID, 'DELETE', OLD.Age, CURRENT_TIMESTAMP);
RETURN OLD;
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER log_employee_changes_trigger
AFTER INSERT OR UPDATE OR DELETE
ON Employees
FOR EACH ROW
EXECUTE FUNCTION log_employee_changes();
Step 3: Attach the Trigger to the Employees Table
Now that we have the function, we can attach it to the Employees table to log changes. Weβll create a trigger that fires on insert, update, and delete operations.
TG_OP: This is a special variable in PostgreSQL that holds the operation type (either INSERT, UPDATE, or DELETE). NEW and OLD: These are references to the row being inserted or updated (NEW) or the row before it was updated or deleted (OLD). EmployeeChangeLog: This table stores the details of the changes (employee ID, operation type, old and new values, timestamp). β Programmer defined.
What happens when you omit FOR EACH ROW?
Statement-Level Trigger: The trigger will fire once per SQL statement, regardless of how many rows are affected. This means it wonβt have access to the individual rows being modified.
For example, if you run an UPDATE statement that affects 10 rows, the trigger will fire once (for the statement) rather than for each of those 10 rows.
No Access to Row-Specific Data: You wonβt be able to use OLD or NEW values to capture the individual rowβs data. The trigger will just execute as a whole, without row-specific actions.
With FOR EACH ROW: The trigger works on each row affected, and you can track specific changes (e.g., old vs new values).Without FOR EACH ROW: The trigger fires once per statement and doesnβt have access to specific row data.
CREATE TRIGGER LogEmployeeAgeUpdate
AFTER UPDATE ON Employees
BEGIN
-- Perform some operation, but it won't track individual rows.
INSERT INTO AuditLogs (EventDescription)
VALUES ('Employees table updated');
END;
NORMALIZATION:
1st NF:
Each column/attribute should have atomic value or indivisible value, ie only one value.
Rows should not be repeated, ie unique rows, there is not necessary to have PKey here.
2nd NF:
Must fulfill the 1st NF. [cadidate key(composite key to form the uniqueness)]
All non-candidate-key columns should be fully dependent on the each attribute/column of the composite keys to form the cadidate key. For ex. If the DB is in denormalalized form (ie before normalization, all tables and values are together in a single table) and the candidate key is (orderId+ProductId), then the non-key(not part of the candidate key) if you take orderdate, orderedStatus, qty, item_price are not dependent on each part of the candidate key ie it depends only orderId, not ProductId, ProductName are not dependent on Order, like that customer details are not dependent on ProductId. So only related items should be there in a table, so the table is partitioned based on the column values, so that each attribute will depend on its candidate key. So Products goto separate table, orders separate and customers going to separate table.
Primary key is created based for each separated table and ensure that all non-key columns completely dependent on the primary key. Then the foreign key relationships also established to connect all the tablesis not fullly dependent on.
3rd NF:
Must fulfill till 2ndNF.
Remove the transitional dependency (In a decentralized DB, One column value(Order ID) is functionally dependent on another column(Product ID) and OrderId is functionally dependent on the OrderId, so that disturbing one value will affect another row with same column value), so to avoid that separate the table, for Ex. from orders table Sales Peopleβs data is separated.
What is a Transitive Dependency? Letβs break this down with a simple example: StudentID Department HODName S001 IT Dr. Rajan S002 CS Dr. Priya
StudentID β Department (StudentID determines the department). Department β HODName (Department determines the HOD name). It should be like StudentID only should determine HOD, not the dept. HODName depends indirectly on StudentID through Department.
This is a transitive dependency, and we need to remove it.
A transitive dependency means a non-prime attribute (not part of the candidate key) depends indirectly on the primary key through another non-prime attribute.