lasasstrange.blogg.se

#XTILE STATA CODE#

The Stata command is: egen port = xtile(marketcap), nquantiles(10) by(date)

The xtile() function is part of the community-contributed package egenmore. The egen command above generates a variable called port, which holds a number from 1 to 10 for each observation, determined by where its marketcap falls among all observations on the same date. The xtile() function names the variable being ranked (marketcap), while the nquantiles(10) option splits the data into deciles. Tickers whose marketcap is in the lowest decile are coded 1, those in the second-lowest decile 2, and so on, up to 10 for tickers whose marketcap is in the highest decile. The result is a number in the port variable for each ticker on each date. A ticker's number can change from month to month, because companies' market caps change and therefore fall into different deciles in different months.

I use monthly time-series data (10+ years), but have included data for 10 tickers over 2 months (inspected with df.head()).

You need to use the pandas.qcut() function, grouping by date and ranking marketcap as in the Stata command: df['port'] = df.groupby('date')['marketcap'].transform(lambda x: pd.qcut(x, 10, labels=False))

The egen command produces equivalent results on the same data (entered with input str7 date str1 ticker return marketcap): egen port = xtile(marketcap), nquantiles(10) by(date)

The differences in the results arise because Stata numbers the groups starting from 1, whereas pd.qcut() with labels=False numbers them starting from 0.
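As a minimal sketch of the pd.qcut() approach described above, the following builds a small made-up sample (the dates, tickers, and marketcap values are hypothetical, not the original data) and assigns deciles within each date. Adding 1 to the result is one way to line the codes up with Stata's 1 to 10 numbering; the two tools may still disagree at the margins when values are tied or fall exactly on a quantile boundary.

```python
import pandas as pd

# Hypothetical sample: 10 tickers over 2 months (values invented for illustration).
tickers = list("ABCDEFGHIJ")
caps_m1 = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
caps_m2 = caps_m1[::-1]  # the ranking flips in the second month

df = pd.DataFrame({
    "date": ["2015m1"] * 10 + ["2015m2"] * 10,
    "ticker": tickers * 2,
    "marketcap": caps_m1 + caps_m2,
})

# Decile of marketcap within each date; labels=False returns integer codes 0..9.
df["port"] = df.groupby("date")["marketcap"].transform(
    lambda x: pd.qcut(x, 10, labels=False)
)

# Shift by one so the codes run 1..10, as in Stata.
df["port_stata"] = df["port"] + 1

print(df)
```

Because each date has ten distinct values here, every ticker lands in its own decile, and a ticker's code changes between months when its relative marketcap changes.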


I am a Stata user and trying to replicate some code in Python. More specifically, I want to create a new column called port. In Stata, my code to achieve the desired output is: egen port = xtile(marketcap), nquantiles(10) by(date)

Computes the Vuong (1989 Econometrica) test of two non-nested regressions as implemented and described in Dechow (1994 Journal of Accounting and Economics). Syntax is "vuong mod1 mod2", where mod1 and mod2 are stored regression results. See comments in the ado file for syntax.

Implementation of Mishkin (1983) rational expectations tests.

Shumway (2001) hazard model estimates, which use a standard logit routine and correct the chi-squared statistics for the average number of observations per cross-sectional unit. Updated to include option lroc to report the area under the ROC curve.

OLS regressions with confidence intervals for ratios of regression coefficients based on Fieller's theorem (more robust than the delta method).

Implementation of various estimation commands with multi-way clustered standard errors as in Cameron, Gelbach and Miller (2010 Journal of Business and Economic Statistics). Also see Petersen (2009 Review of Financial Studies). Commands include linear regression (cgmreg), linear regression with Fieller (1954) confidence intervals on coefficient ratios (cgmregF), logit (cgmlogit), fractional logit (cgmflogit) and regression with bootstrapped p-values (cgmwildboot; see Cameron, Gelbach and Miller, 2008 Review of Economics and Statistics). Updated to deal with dropped variables (e.g., fixed effects dropped due to collinearity). Updated cgmwildboot to fix an error introduced by the prior fix for dropped variables. Updated cgmwildboot to fix p-values (e.g., the program looks at the upper tail for p-values of positive coefficients, and this was giving p-values greater than one when the coefficient was in the bottom half of the bootstrap distribution).


Baxter-King (1995) bandpass filter for time series.

Fama-MacBeth (1973) regressions with options to weight by the number of observations as a proxy for the precision of the years' estimates, and an option to use a Newey-West correction for serial correlation in coefficient estimates.
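To make the Fama-MacBeth idea concrete, here is a rough Python sketch of the basic two-step procedure: run a cross-sectional OLS regression in each period, then average the coefficients over periods and use their time-series variation for standard errors. It is not the ado file's implementation. The function name fama_macbeth and the simulated data are assumptions for illustration, and the sketch is unweighted and omits the Newey-West correction mentioned above.

```python
import numpy as np

def fama_macbeth(y, X, periods):
    """Per-period cross-sectional OLS, then average coefficients over periods."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    periods = np.asarray(periods)
    coefs = []
    for p in np.unique(periods):
        mask = periods == p
        # Add an intercept column to this period's regressors.
        Xp = np.column_stack([np.ones(mask.sum()), X[mask]])
        beta, *_ = np.linalg.lstsq(Xp, y[mask], rcond=None)
        coefs.append(beta)
    coefs = np.vstack(coefs)
    n_periods = coefs.shape[0]
    mean = coefs.mean(axis=0)
    # Standard error from the time-series variation of the period estimates.
    se = coefs.std(axis=0, ddof=1) / np.sqrt(n_periods)
    return mean, se, mean / se

# Simulated panel (hypothetical): 12 periods of 20 observations, y = 1 + 2x + noise.
rng = np.random.default_rng(0)
periods = np.repeat(np.arange(12), 20)
x = rng.normal(size=periods.size)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=periods.size)

mean, se, tstat = fama_macbeth(y, x, periods)
print("coefficients:", mean)
print("standard errors:", se)
```

Weighting periods by their number of observations, or replacing the simple standard error with a Newey-West estimate, would be extensions to the averaging step and are left out here.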