Mastering SQL for Data Science: A Comprehensive Guide Day-3
Mastering SQL for Data Science: A Comprehensive Guide Day-3
SQL Functions
1). Group function / aggregate function:
ü Avg: This function returns as an average of the value of n, ignoring null values in a column.
Syntax: avg([<distinct>|<all>]<n>)
E.g: select avg(sal) from emp;
ü Min: This function returns a minimum value of expr.
Syntax: min([<distinct>|<all>]<n>)
E.g: select min(sal) from emp;
ü Max: This function returns a maximum value of expr.
Syntax: max([<distinct>|<all>]<n>)
E.g: select max(sal) from emp;
ü Sum: This function returns the sum of the values of n.
Syntax: sum([<distinct>|<all>]<n>)
E.g: select sum(sal) from emp;
ü Count: This function returns the number of rows where expression is not null.
Syntax: count(expr)
E.g: select count (emp_no) from emp;
ü Count(*): This function returns the number of rows in the table, including duplicates and those with nulls.
Syntax: count(*)
E.g: select count(*) from emp;
2). Scalar functions:
1). String functions:
Many implementations of SQL provide functions to manipulate characters and strings of characters. This section covers the most common character functions.
ü Initcap: This string function is used to capitalize first character of the input string.
Syntax: initcap(string)
E.g: select initcap(‘technology’) from dual;
o/p: Technology
ü Lower:
This function is used to convert input string into the lower case.
Syntax: lower(string)
E.g: select lower (‘TECHNOLOGY’) from dual;
o/p: technology
ü Upper:
This function is used to convert input string into the upper case.
Syntax: upper(string)
E.g: select upper(‘technology’) from dual;
o/p: TECHNOLOGY
ü Ltrim:
This function accepts two string parameters; it will fetch only those set of characters from the first string from the left side of the first string, and displays only those characters which are not present in the second string. If same set of the characters are found in the first string it will display whole string;
Sytanx: ltrim(string,set)
E.g: select ltrim(‘technology’,’tech’) from dual;
o/p: nology
ü Rtim:
This function accepts two string parameters; it will fetch only those characters from the first string, which is present in set of characters in second string from the right side of the first string.
Syntax: rtrim(string)
E.g: select rtrim (‘technolog’,’nology’) from dual;
o/p: tech
ü Trim:
This function removes all specified characters either from the beginning or the ending of a string.
Syntax: trim([Leading|Trailing|Both [<trim_char> from] ]<string>)
E.g: 1) select trim (‘ technology ’) from dual;
o/p: Technology
2). Select trim (Leading ‘x’ from ‘xxxtechnology’) from dual;
o/p: technology
3). Select trim (Trailing ‘x’ from ‘technologyxxx’) from dual;
o/p: technology
4). Select trim(both ‘x’ from ‘xxxtechnologyxxx’) from dual;
o/p: technology
ü Replace:
This function is useful when you want to search a specified string and replace it with particular string from the string provided.
Syntax: replace (string, search_string, replace_string)
E.g: select replace (‘jack and jue’, ‘j’, ‘bl’) from dual;
o/p: black and blue
ü Substr:
This function is fetches out a piece of the string beginning at start and going for count characters. If count is not specified, the string is fetched from start and goes till end of the string.
Syntax: substr(string,start,count)
E.g: select substr(‘technology’,4,6) from dual;
o/p: hnolog
ü Lpad:
This function takes three arguments. The first argument is character string, which has to be displayed with the left padding. Second is a number, which indicates total length of return value and third is the string with which left padding has to be done when required.
Syntax: lpad(string, length,pattern)
E.g: select lpad(‘technology’,15,’*’) from dual;
o/p: *****technology
ü Rpad:
This function does exact opposite then lpad function.
Syntax: rpad(string, length,pattern)
E.g: select rpad(‘technology’,15,’*’) from dual;
o/p: technology*****
ü Length:
When the length function is used in a query, it returns length of the input string.
Syntax: length(string)
E.g: select length(‘technology’) from dual;
o/p: 10
2). Numeric function:
ü Floor:
This function returns the largest integer that is less than or equal to its arguments.
Syntax: floor(n)
E.g: select floor(128.3),floor(129.8) from dual;
o/p: 128 129
ü Ceil:
This function returns the smallest integer that is greater than or equal to its argument.
Syntax: ceil(n)
E.g: select ceil(128.3),ceil(129.8) from dual;
o/p: 129 130
ü Power:
This function returns the value of m raised to the nth power. ‘n’ must be an integer else an error is returned.
Syntax: power(m,n)
E.g: select power(10,2),power(2,16) from dual;
o/p: 100 65536
ü Sqrt:
This function returns the square root of n. If n is less than zero, oracle returns an error. It returns a real result.
Syntax: sqrt(n)
E.g: select sqrt(256) from dual;
o/p: 16
ü Abs:
This function always returns the positive numbers.
Syntax: abs(negative number)
E.g: select abs(-13) from dual;
o/p: 13
ü Greatest:
This function returns the greatest value in the list of expression.
Syntax: greatest(exp1, exp2,……expn)
E.g: select greatest (5,87,6) from dual;
o/p: 87
ü Least:
This function returns the least value in the list of expression.
Syntax: least(exp1, exp2,…..expn)
E.g: select least (5,87,6) from dual;
o/p: 5
ü Mod:
This function returns the remainder of the number divided by second number passed a parameter. If the second number is zero, the result is the same as the first number.
Syntax: mod(number, division value)
E.g: select mod(10,3) from dual;
o/p: 1
3). Date function:
To manipulate and exact values from the date column of a table oracle provides some date functions.
ü Add_months:
This function returns date after adding the number of months specified in the function.
Syntax: add_momths(d,n)
E.g: select add_months (sysdate,4) from dual;
ü Last_day:
This function returns the last date of the month specified with the function.
Syntax: last_day(d)
E.g: select sysdate,last_day(sysdate) ‘last day’ from dual;
ü Months_between:
This function returns number of months between d1 and d2.
Syntax: months_between(d1,d2)
E.g: select months_between(dot,dor) from dual;
ü Next_day:
This function returns the date of the first weekday named by char that is after the date named by date. Char must be a day of the week.
Syntax: next_day(date,char)
E.g: select next_day(sysdate,’Friday’) from dual;
ü Round:
This function returns a date rounded to a specific unit of measure. If the second parameter is omitted, the round function will round the date of the nearest day.
Syntax: round(date,[format])
E.g: select round (to_date(’04-july-17’),’yyyy’) from dual;
ü Systimestamp:
This function returns the current system date and time on your local database.
E.g: select systimestamp from dual;
Ø Group by clause:
Till now, all sql select statements have; retrieved all the rows from tables, retrieved selected rows from the tables with the use of a where clause, which returns only those rows that meet the conditions specified. Other than the above clauses, there are two other clauses, which facilitate selective retrieval of rows. There are the group by and having clause. The group by clause is another selection of select statement. Group by clause is used with group functions only. Normally, group functions returns only one row at a time. But group by clause will group on that column.
The group by cause tells oracle to group rows based on distinct values for specified columns. i.e. it creates a data set, containing several sets of records grouped together based on a condition.
Syntax: select <column_name1><column_name2>group function(<expression>) From <table_name> where <condition> group by <column_name>;
E.g: select dept_no, count(*) from employee group by dept_no;
Ø Having clause:
The having clause can be used in the conjunction with the group by clause. The having clause is used to also satisfy certain conditions on rows, retrieved by using group by clause. Having clause further filters the rows return by group by clause.
E.g: select dept_no, count(*) from employee group by dept_no having dept_no is not null;
Rules for group by and having clause:
- Columns listed in the select statement have to be listed in the group by clause.
- Columns listed in the group by clause need not to be listed in the select statement.
- Only one group function can be used in the group by clause.