How do I delete a pattern in a column using sed command line?


I have a table like this below:

"baseMean" "log2FoldChange" "lfcSE" "stat" "pvalue" "padj"
"ENSG00000000003.14" 11.3434183210348 0.753849141787545 0.682104979689654 1.10518052826785 0.269081372382168 0.999928163137131
"ENSG00000000419.12" 793.733816508413 -0.256066185652526 0.133681398896401 -1.91549600592503 0.0554292780227467 0.863889514659372
"ENSG00000000457.13" 948.240987147508 -0.088027064401221 0.0869481579436567 -1.01240861776811 0.3113427195966

I want to delete the quotation marks and the .X version suffix in the 1st column, like this: "ENSG00000000003.14" ----> ENSG00000000003. How can I write a command line for this using sed or any other tool?

---------------Answer---------------

Using sed:

$ echo \"ENSG00000009694.13\" 3.25851232080741 0.670268379884225 | sed -E "s|\"(.+?)\.[0-9]*\"|\1|g"
ENSG00000009694 3.25851232080741 0.670268379884225

I guess you are dealing with a large file rather than a single line. In that case, pass the file name to sed:

$ sed -E "s|\"(.+?)\.[0-9]*\"|\1|g" your_file.txt
"baseMean"  "log2FoldChange"    "lfcSE" "stat"  "pvalue"    "padj"
ENSG00000000003    11.3434183210348    0.753849141787545   0.682104979689654   1.10518052826785    0.269081372382168   0.999928163137131

and sed will print the result to standard output. You can also add -i in front of -E; -i stands for "in-place mode", so sed will modify your file directly instead of printing the result.
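
If you do edit in place, keeping a backup is cheap insurance. Here is a minimal sketch, assuming a scratch file created with mktemp (the file and its two lines are demo data, not your real table). As far as I know, the attached-suffix form -i.bak is accepted by both GNU and BSD/macOS sed, whereas a bare -i with no suffix is GNU behaviour:

```shell
# Create a scratch file with two lines taken from the table (demo data only).
tmp=$(mktemp)
printf '%s\n' \
  '"baseMean" "log2FoldChange"' \
  '"ENSG00000000003.14" 11.3434183210348 0.753849141787545' > "$tmp"

# Same expression as above; -i.bak edits the file in place and
# keeps the untouched original next to it as "$tmp.bak".
sed -i.bak -E "s|\"(.+?)\.[0-9]*\"|\1|g" "$tmp"

cat "$tmp"       # edited file
cat "$tmp.bak"   # original file
```

Once you have checked the edited file, the .bak copy can be deleted.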

Explanation:

I am using the "find and replace" (substitute) command of sed. The basic syntax is

sed -E "s|p1|p2|g"

and then sed will replace p1 with p2. -E stands for "extended regex mode", so operators such as ( ), + and ? can be used in p1 without backslashes.
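
To see the substitute command on its own, here is a tiny sketch on a made-up string (not from the table), where p1 is the regex [0-9]+ and p2 is the letter N. The variable name demo is just for illustration:

```shell
# Replace every run of digits with "N".
# -E lets us write + without a backslash.
demo=$(echo "a1 b22 c333" | sed -E "s|[0-9]+|N|g")
echo "$demo"   # prints: aN bN cN
```

The | characters are only delimiters (any character can stand in for the usual /), and the trailing g replaces every match on a line rather than only the first one.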

My p1 here is (omit ( and ) for now)

\".+?\.[0-9]*\"

in which

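  • \" matches the quotation marks,
  • \.[0-9]* matches a dot followed by zero or more digits, and
  • .+? matches the characters in between. Note that POSIX extended regular expressions have no lazy quantifiers, so the ? does not make .+ non-greedy here; the pattern still works on this table because each data row contains only one quoted field.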

Then p2 is simply \1, which refers to the text captured by the first ( and ) pair in p1, that is, the ID without the quotation marks and the version suffix. And it is done!
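
If you prefer a pattern that spells out its limits, one possible variant (my own assumption about what is wanted, not the command used earlier) anchors the match at the start of the line and uses a negated character class, so the capture cannot run past the first dot or the closing quote. The function name strip_version is made up for illustration:

```shell
# strip_version: hypothetical helper; reads the table on stdin.
#   ^"          opening quote at the start of the line
#   ([^".]+)    the ID: one or more characters that are neither " nor .
#   \.[0-9]+"   the version suffix followed by the closing quote
strip_version() {
  sed -E 's/^"([^".]+)\.[0-9]+"/\1/'
}

# Header lines contain no dot inside the first quoted field, so they pass through unchanged.
printf '%s\n' \
  '"baseMean" "log2FoldChange" "lfcSE"' \
  '"ENSG00000000419.12" 793.733816508413 -0.256066185652526' | strip_version
```

On a real file this would be used as strip_version < your_file.txt. The single quotes around the sed script also avoid having to escape the inner double quotes.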

