Add a DataFrame.show() method pls! #1889

Closed
halleygithub opened this issue Sep 11, 2012 · 14 comments
Labels
Enhancement Output-Formatting __repr__ of pandas objects, to_string

Comments

@halleygithub

'print df' gives something like the following when the DataFrame 'df' is too big to fit on the screen:

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 41955 entries, (u'000002', u'20061231') to (u'603366', u'20120630')
Columns: 147 entries, STK_ID to EPS
dtypes: float64(135), object(12)

But most of the time I want a glimpse of the data itself, which helps me see what has happened to the DataFrame.

Could the pandas developers add a 'show()' method to DataFrame that displays part of the data? Namely, show the four corners (upper left, upper right, lower left, lower right) and use '...' to represent the omitted part.

Something like:

                 STK_ID  RPT_Date  STK_Name  ..  OprCF_PS    EPS
STK_ID RPT_Date
000002 20061231  000002  20061231  万科A     ..    -0.692  0.526
       20070331  000002  20070331  万科A     ..    -0.741  0.140
       20070630  000002  20070630  万科A     ..    -0.454  0.254
...    ...       ...     ...       ...       ..       ...    ...
       20071231  000002  20071231  万科A     ..    -1.519  0.705
       20080331  000002  20080331  万科A     ..    -0.207  0.105

@wesm
Member

wesm commented Sep 11, 2012

Will keep it in mind. I'd happily accept a pull request, too, if you get around to it.

@halleygithub
Author

Below is the snippet I currently use. Please note that it does not implement the row-wise part yet (the difficulty is that I don't know how to set or insert a row of '..'):

def sw(df,first_rows = 20,last_rows =10,first_cols =3,last_cols =2):
    ''' display the df (can be dataframe or series) sample data
    set_printoptions(max_columns=80,max_rows=30)
    A,B (upt)
    C,D (downpt)

    '''
    set_printoptions(max_columns=80,max_rows=30)

    df =DataFrame(df) # convert to dataframe if input 'df' is series

    ncol=len(df.columns)
    nrow=len(df)

    if ncol <= (first_cols + last_cols) :
        upt = df.ix[0:first_rows,:]         # screen width can contain all columns
        dowpt = df.ix[-last_rows:,:]
        pall = concat([upt,dowpt])
    else:                                   # screen width can not contain all columns
        pa = df.ix[0:first_rows,0:first_cols]
        pb = df.ix[0:first_rows,-last_cols:]
        pc = df.ix[-last_rows:,0:first_cols]
        pd = df.ix[-last_rows:,-last_cols:]

        upt =  merge(pa,pb,how='inner',left_index=True, right_index=True)
        dowpt =  merge(pc,pd,how='inner',left_index=True, right_index=True)
        pall = concat([upt,dowpt])
        pall['..'] = '..'
        pall = __col_seq_set__(pall,['..'],[first_cols])

    print "\n*****************************************************************"
    print pall
    print df.columns
    print "row: %d    col: %d"%(len(df),len(df.columns))
    print "*****************************************************************\n"
    return None

DataFrame.show = sw
Series.show = sw
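
For readers on a current pandas version, the four-corner idea above, including the row of '..' that the snippet leaves unimplemented, might be sketched roughly as follows. This is only an illustration under assumptions: `corner_preview` is a hypothetical helper name (not a pandas API), the frame is assumed to have a flat (non-MultiIndex) index and no existing column named '..', and all cells are converted to strings so the placeholders can sit next to the data.

```python
import pandas as pd


def corner_preview(df, first_rows=3, last_rows=2, first_cols=3, last_cols=2):
    """Return a small string-typed frame showing the four corners of df.

    Hypothetical helper for illustration only. Omitted rows and columns
    are marked with '..' placeholders.
    """
    df = pd.DataFrame(df)  # accept a Series as well
    nrow, ncol = df.shape

    # Integer positions of the rows and columns to keep.
    if nrow <= first_rows + last_rows:
        row_pos = list(range(nrow))
    else:
        row_pos = list(range(first_rows)) + list(range(nrow - last_rows, nrow))
    if ncol <= first_cols + last_cols:
        col_pos = list(range(ncol))
    else:
        col_pos = list(range(first_cols)) + list(range(ncol - last_cols, ncol))

    out = df.iloc[row_pos, col_pos].astype(str)

    # Column of '..' between the left and right blocks.
    if len(col_pos) < ncol:
        out.insert(first_cols, "..", "..")

    # Row of '..' between the top and bottom blocks.
    if len(row_pos) < nrow:
        gap = pd.DataFrame(
            [[".."] * out.shape[1]], index=[".."], columns=out.columns
        )
        out = pd.concat([out.iloc[:first_rows], gap, out.iloc[first_rows:]])

    return out


df = pd.DataFrame(
    [[r * 10 + c for c in range(10)] for r in range(10)],
    columns=list("abcdefghij"),
)
preview = corner_preview(df)
print(preview)
```

Returning a frame instead of printing keeps the helper testable; the caller decides whether to print it.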

@changhiskhan
Contributor

@halleygithub you want to make this into a PR? You're almost there, just need to add a few test cases. Thanks in advance!

@halleygithub
Author

Yes, please feel free to process it further as you like. I am a newbie to pandas and GitHub, and not really a programmer. I'm glad I can contribute to the package.

@paulproteus

It would be great for a contributor to take @halleygithub 's code here, add a test case, and submit it as a pull request.

@dundo4he

Where is __col_seq_set__()? I couldn't find it with grep -r "__col_seq_set__" pandas/*

@paulproteus

@dundo4he it seems to me that it's a function that @halleygithub wrote and hasn't shared yet.

It's "probably" not too hard to figure out what it was, based on the output @halleygithub provided. Does that seem to be doable? If not, we should figure something else out.

@halleygithub
Author

def _col_seq_set(df, col_list, seq_list):
''' set dataframe col_list's sequence of 'df' by seq_list '''
df_col = list(df.columns)
fn_col = [x for x in df_col if x not in col_list]

for i in range(len(col_list)):
    fn_col.insert(seq_list[i], col_list[i])

return df[fn_col]

DataFrame.col_seq_set = _col_seq_set

@paulproteus

Thanks, @halleygithub !

I'll just provide the same code with a preformatting tag:

def _col_seq_set(df, col_list, seq_list):
    ''' set dataframe col_list's sequence of 'df' by seq_list '''
    df_col = list(df.columns)
    fn_col = [x for x in df_col if x not in col_list]

    for i in range(len(col_list)):
        fn_col.insert(seq_list[i], col_list[i])

    return df[fn_col]
DataFrame.col_seq_set = _col_seq_set

Also, @halleygithub, is it OK if we reuse your code under the same terms as pandas, available at https://github.com/pydata/pandas/blob/master/LICENSE ?

@halleygithub
Author

Sure, you can. I'd be glad if it helps. (Sorry for my ugly code :-) )

@halleygithub
Author

Oh, you don't need "_col_seq_set()" at all; it is a function in my application that reorders columns in batch. In the "show()" method, you only need to put the pall[".."] column at position first_cols+1 instead of calling "pall = __col_seq_set__(pall,['..'],[first_cols])".
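
As a rough illustration of that last point, in current pandas the placeholder column can be placed directly with `DataFrame.insert`, which takes a 0-based position. The frame and column names below are made up for the example; only `pall` and `first_cols` come from the snippet earlier in the thread.

```python
import pandas as pd

first_cols = 3

# Stand-in for the concatenated corner blocks; values are illustrative.
pall = pd.DataFrame(
    [[1, 2, 3, 9, 10], [11, 12, 13, 19, 20]],
    columns=["a", "b", "c", "i", "j"],
)

# DataFrame.insert modifies the frame in place. Position `first_cols`
# (0-based) is the slot right after the first `first_cols` columns,
# i.e. the (first_cols + 1)-th column counting from one.
pall.insert(first_cols, "..", "..")

print(list(pall.columns))  # ['a', 'b', 'c', '..', 'i', 'j']
```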

@dundo4he

dundo4he commented Jan 8, 2013

Please correct me if I am wrong. It seems that numpy.ndarray type automatically adjusts to fit the screen if the ndarray is too large. Can we borrow that mechanism?

import numpy as np

data = np.random.rand(100,100)

print data

[[ 0.98734521  0.54738576  0.43711897 ...,  0.11306541  0.22723003
   0.10952995]
 [ 0.14806827  0.12672894  0.46958608 ...,  0.10808818  0.43853282
   0.02945122]
 [ 0.8642931   0.40443047  0.93839959 ...,  0.70985694  0.99053461
   0.92551388]
 ...,
 [ 0.25710058  0.20474109  0.21222875 ...,  0.90249302  0.89936846
   0.14084486]
 [ 0.04801022  0.85745347  0.76647051 ...,  0.85480267  0.23448934
   0.69833225]
 [ 0.20308408  0.79021899  0.21764972 ...,  0.88353496  0.83787784
   0.82672697]]
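
For context, the NumPy behaviour shown above is governed by two print options: `threshold` (the element count above which an array is summarized) and `edgeitems` (how many items are kept at each end of every axis). A small sketch in current Python 3 syntax, with values chosen only for illustration:

```python
import numpy as np

data = np.arange(10000, dtype=float).reshape(100, 100)

# threshold: summarize once the array has more elements than this.
# edgeitems: number of leading and trailing items kept on each axis.
with np.printoptions(threshold=50, edgeitems=2):
    text = str(data)

print(text)
```

The options are about which elements to keep, not about measuring the terminal, so borrowing the mechanism would mainly mean borrowing the head/tail selection idea.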

@halleygithub
Author

Yes, I noticed that too, but I dislike NumPy's default format because of:

  1. the many '[' and ']' characters
  2. the lack of column names and index names (for a pandas DataFrame)
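
For comparison, current pandas versions can produce a truncated view that keeps the column and index names, for example through the `max_rows` and `max_cols` parameters of `DataFrame.to_string`. The sketch below uses made-up data and labels, and the exact layout of the output may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    [[r * 10 + c for c in range(10)] for r in range(20)],
    columns=[f"col{c}" for c in range(10)],
)
df.index.name = "row"

# Keep at most 6 rows and 4 columns; the omitted middle is shown as '...'.
text = df.to_string(max_rows=6, max_cols=4)
print(text)
```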

@sinhrks
Member

sinhrks commented Apr 9, 2016

Closed by #5550
