Tutorial: Using str.find() and [n:m] slice notation to extract data from strings in Python

When you need to extract data from a string in Python, you can use the built-in string method find() to locate a marker and the [n:m] slice notation to extract the substring you want.

string.find(sub) returns the starting position of the first instance of sub in string. For example, 'abcdecd'.find('cd') returns 2, because the first 'cd' starts at position 2 (not 3, since positions are counted from 0). If sub cannot be found in the string, the method returns -1.
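As a quick check of the behaviour described above, here is a minimal sketch that can be pasted into a Python interpreter. The variable names are only for illustration. The optional second argument to find() sets the position where the search starts, which is one way to reach a later match.

```python
text = 'abcdecd'

# first 'cd' starts at position 2 (positions are counted from 0)
first = text.find('cd')

# a substring that does not occur gives -1
missing = text.find('xyz')

# start searching just after the first match to find the next one
second = text.find('cd', first + 1)

print(first, missing, second)  # 2 -1 5
```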

The string[n:m] notation extracts the substring that starts at position n and ends at position m-1; the character at position m is not included. If m is left out, the slice runs to the end of the string. For example, if the content of my_str is "I don't know", my_str[2:] returns "don't know" and my_str[2:7] returns "don't".
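The slicing rules can be tried out the same way. This is a small sketch using the my_str value from the text; the names rest, word and first_char are made up for the example.

```python
my_str = "I don't know"

# from position 2 to the end of the string
rest = my_str[2:]

# from position 2 up to, but not including, position 7
word = my_str[2:7]

# leaving out n starts the slice at the beginning
first_char = my_str[:1]

print(rest)        # don't know
print(word)        # don't
print(first_char)  # I
```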

By using string.find() and [n:m] notation, we can write a script to automatically extract the data we need from a string.
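To illustrate how the two pieces fit together, here is a rough sketch of a small helper. The function name extract_after_marker and the sample line are hypothetical and chosen only for this example: find() locates the marker, and the slice takes everything after it.

```python
def extract_after_marker(line, marker=' '):
    """Return the part of line after the first marker, or None if absent."""
    position = line.find(marker)
    if position == -1:
        # find() returns -1 when the marker is not in the line
        return None
    # skip over the marker itself, then slice to the end of the line
    return line[position + len(marker):]


sample_line = 'Peter 100'  # made-up sample: a name, a space, a score
score_text = extract_after_marker(sample_line)
print(score_text)  # 100
```

The result is still a string, so it has to be converted with int() or float() before doing arithmetic on it.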

For example, suppose we have a file test_score.txt that records the final exam scores of all the students enrolled in a class. We know that each line holds a student's first name and his or her score, separated by a single space, as follows.

Peter 100
Anna 99
Henry 98
Jerry 97



If we want to calculate the average final exam score, we can use find() and the [n:m] notation to do it.


# average_sample.py

# used to accumulate the scores
total_score = 0.0

# used to count how many scores (students) we have
number_of_scores = 0

# open the file for reading; 'with' closes it when the block ends
with open('test_score.txt', 'r') as file_h:
    # read the file line by line
    for line in file_h:
        # find the marker (the space between the name and the score)
        white_space_position = line.find(' ')

        # calculate the position of the score
        # in relation to the position of the marker
        score_position = white_space_position + 1

        # extract the substring and convert it into a floating
        # point number (float() ignores the trailing newline)
        score = float(line[score_position:])

        # print the score
        print('Score: {}'.format(score))

        # accumulate the scores
        total_score = total_score + score

        # count how many scores we got
        number_of_scores = number_of_scores + 1

# calculate and display the average
print('Average: {}'.format(total_score / number_of_scores))
 
You will get the following output on the terminal.

Score: 100.0
Score: 99.0
Score: 98.0
Score: 97.0
Average: 98.5
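
The script assumes that every line contains a space. On a blank line, find() returns -1, the slice starts at position 0, and float() raises a ValueError. A more defensive variant might look like the sketch below. The function name average_scores and the in-memory sample_lines list (with one blank line added) are assumptions made so that the example runs without the file.

```python
def average_scores(lines):
    """Average the scores in lines shaped like 'Name 100'."""
    total_score = 0.0
    number_of_scores = 0
    for line in lines:
        position = line.find(' ')
        if position == -1:
            # no marker on this line (for example a blank line): skip it
            continue
        total_score += float(line[position + 1:])
        number_of_scores += 1
    if number_of_scores == 0:
        # nothing to average, so avoid dividing by zero
        return None
    return total_score / number_of_scores


# stand-in for the lines of test_score.txt, plus one blank line
sample_lines = ['Peter 100\n', 'Anna 99\n', '\n', 'Henry 98\n', 'Jerry 97\n']
average = average_scores(sample_lines)
print('Average: {}'.format(average))  # Average: 98.5
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the list.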
