Stata
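4. Programming
Stata is a statistical analysis package, but it also has strong programming-language capabilities, which give users wide scope to build their own applications and tailor analyses to their needs. In fact, Stata's ado-files (the advanced statistics components) are written in Stata's own language.
Stata reads the entire dataset into memory for analysis and exchanges data with the disk only after the computation has finished. It is operated from the command line, which keeps it simple to use, and the statistical graphs it produces are polished and distinctive.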
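1. Statistical analysis
Stata's statistical capabilities are strong. Beyond traditional methods, it includes methods developed over the past 20 years, such as Cox proportional hazards regression, exponential and Weibull regression, logistic regression for multinomial and ordered outcomes, Poisson regression, negative binomial and generalized negative binomial regression, and random-effects models.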
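3. Matrix operations
Matrix algebra is an important tool in multivariate statistical analysis. Stata provides the basic matrix operations that such analysis requires, including addition, multiplication, inversion, Cholesky decomposition, and the Kronecker product. It also provides more advanced operations, such as eigenvalues, eigenvectors, and singular value decomposition. After certain statistical commands have run, it makes system matrices available, such as the vector of estimated coefficients and the covariance matrix of the estimates.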
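To make two of the operations named above concrete, here is a minimal pure-Python sketch of a Cholesky decomposition and a Kronecker product. It is an illustration only, not Stata's implementation; the function names `cholesky` and `kronecker` and the example matrices are chosen here for the example, and the Cholesky routine assumes a symmetric positive-definite input.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that L times its transpose equals a.

    Assumes a is a symmetric positive-definite matrix given as a list of rows.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the parts of rows i and j computed so far.
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def kronecker(a, b):
    """Return the Kronecker product of matrices a and b (lists of rows)."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


A = [[4.0, 2.0], [2.0, 3.0]]
L = cholesky(A)  # [[2, 0], [1, sqrt(2)]]
K = kronecker([[1, 2], [3, 4]], [[0, 1], [1, 0]])
```

Each block of `K` is the second matrix scaled by one entry of the first, which is why a 2×2 by 2×2 product yields a 4×4 result.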
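Stata's most distinctive trait is that it is compact yet powerful, covering the full range of statistical analysis, data management, and graphics. In addition, because Stata reads all data into memory and exchanges data with the disk only after the computation has finished, it runs very fast.
Stata is not only simple to operate; its data format is simple and its output is concise and easy to read, all of which makes it very well suited to teaching statistics.
Many of Stata's advanced statistical modules are program files (ado-files) written by programmers in its macro language, and these files can be modified, added to, and downloaded. Users can search the Stata website at any time and download the latest updates. This keeps Stata at the forefront of statistical methodology: users can usually find a Stata implementation of a new statistical algorithm quickly. It also makes Stata the most frequently updated of the major statistical packages.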
Broad suite of statistical features
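5. New in Stata 16
▶ Lasso-based machine learning
▶ Reproducible reporting
▶ Meta-analysis
▶ Choice models
▶ Integration with Python
▶ New in Bayesian analysis: multiple Markov chains, Bayesian predictions, and more
▶ Import SAS and SPSS data
▶ Nonparametric series regression
▶ Multiple datasets in memory
▶ Sample-size analysis for confidence intervals
▶ Panel-data mixed logit models
▶ Nonlinear DSGE models
▶ Multiple-group IRT models
▶ Panel-data Heckman selection models
▶ Nonlinear mixed-effects models with lags, leads, and differences
▶ Configurable sizes for graph elements
▶ Numerical integration
▶ Linear programming
▶ All-new Mac interface
▶ Autocompletion in the Do-file Editor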
Tables
Customize your tables of
✔Summary statistics
✔Results from hypothesis tests
✔Regression results
✔LR and Wald tests, GOF statistics
✔Results from any Stata command
Export to
✔Word, Excel
✔LaTeX
✔HTML, Markdown
✔PDF
Bayesian econometrics
Bayesian
✔VAR models
✔IRF and FEVD analysis
✔Dynamic forecasting
✔Panel/longitudinal-data models
✔Linear and nonlinear DSGE models
PyStata-Python and Stata
✔Call Python from Stata.
✔Call Stata from Python.
✔Exchange data, metadata, and results seamlessly.
✔Use Stata from Jupyter Notebook, Spyder, PyCharm IDE
Jupyter Notebook with Stata
✔Invoke Stata and Mata from Jupyter Notebook.
✔Easily reproduce your work and collaborate with others.
✔Access results from Stata analyses within Python.
✔Stata output, graphs, and tables seamlessly integrate with your Jupyter Notebook.
Difference-in-differences (DID) and DDD models
✔Evaluate the effect of a policy, a treatment, or an intervention.
✔Control for confounding unobserved group and time characteristics.
✔Use panel data or repeated cross-sections.
✔Use DID. In vogue since 1855.
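As a rough illustration of the idea behind DID, the sketch below computes the classic two-group, two-period estimate from cell means: the change in the treated group minus the change in the control group. It is not Stata's estimator, which also handles covariates, multiple periods, and standard errors. The function name `did_estimate`, the row layout, and the toy data are all hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Return the 2x2 difference-in-differences estimate.

    rows: iterable of (treated, post, outcome), with treated and post in {0, 1}.
    Every one of the four (treated, post) cells must contain at least one row.
    """
    cells = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    # Change over time in the treated group minus change in the control group.
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Toy data: (treated, post, outcome)
data = [
    (0, 0, 10.0), (0, 0, 12.0),
    (0, 1, 13.0), (0, 1, 15.0),
    (1, 0, 20.0), (1, 0, 22.0),
    (1, 1, 28.0), (1, 1, 30.0),
]
effect = did_estimate(data)  # (29 - 21) - (14 - 11) = 5
```

Subtracting the control group's change removes any time trend common to both groups, which is what lets DID control for unobserved group and time characteristics under the parallel-trends assumption.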
Faster Stata
Stata is fast and keeps getting faster.
✔Faster sort and collapse
✔Faster mixed models
✔Faster estimation commands
✔Faster import delimited
✔And more
Interval-censored Cox model
You want to model time to an event.
But you don't know the exact event times—only the intervals in which events happen.
And you don't want to make parametric assumptions.
Try an interval-censored Cox model.
Bayesian VAR models
You fit your VAR models with var.
You fit your Bayesian regression models with bayes:.
Now fit your Bayesian VAR models with bayes: var.
Bayesian multilevel modeling
Nonlinear, joint, SEM-like, and more.
More multilevel models.
More powerful.
Easier to use.
Multivariate meta-analysis
Do you have multiple effect sizes?
Do they share a common control group?
Do they share the same group of subjects?
Treatment-effects lasso estimation
When you want:
Causal inference, average treatment effects, potential-outcome means, double-robust estimation
And you have:
Many (maybe hundreds or thousands of) potential covariates
Use treatment-effects estimation with lasso variable selection.
Galbraith plots
Graphically summarize meta-analysis results
✔Study-specific effect sizes
✔Precision of effect sizes
✔Overall effect size
Detect potential outliers
Assess heterogeneity
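The quantities a Galbraith plot displays can be sketched in a few lines. The example assumes the usual construction: each study is plotted at x = 1/SE (precision) and y = effect/SE (standardized effect), and the slope of the regression line through the origin equals the inverse-variance-weighted overall effect. The function names and the numbers are illustrative only; the sketch draws nothing and is not Stata's implementation.

```python
def galbraith_points(effects, std_errors):
    """Return (precision, standardized effect) pairs, one per study."""
    return [(1.0 / se, e / se) for e, se in zip(effects, std_errors)]


def slope_through_origin(points):
    """Least-squares slope of a line through the origin.

    For Galbraith points this equals the inverse-variance-weighted
    overall effect size.
    """
    return sum(x * y for x, y in points) / sum(x * x for x, _ in points)


effects = [0.2, 0.4, 0.6]      # hypothetical study effect sizes
std_errors = [0.1, 0.2, 0.1]   # hypothetical standard errors

points = galbraith_points(effects, std_errors)
overall = slope_through_origin(points)  # 90 / 225 = 0.4
```

Studies whose points lie far from the fitted line are the candidates for outliers, and a wide scatter around the line suggests heterogeneity.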
Leave-one-out meta-analysis
Leave-one-out meta-analysis performs multiple meta-analyses by excluding one study at each analysis. It is common for studies to produce exaggerated effect sizes, which may distort the overall results. Leave-one-out meta-analysis is useful to investigate the influence of each study on the overall effect-size estimate and to identify influential studies.
You can now perform leave-one-out meta-analysis by using the new leaveoneout option with meta summarize and meta forestplot.
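The procedure can be sketched outside Stata as follows. This minimal example assumes a common-effect (inverse-variance) pooling model for simplicity, which may differ from the model a given Stata analysis uses. The function names and the three study values are hypothetical.

```python
def pooled_effect(effects, std_errors):
    """Inverse-variance-weighted (common-effect) pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, std_errors):
    """Return the pooled estimate with each study omitted in turn."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_errors = std_errors[:i] + std_errors[i + 1:]
        results.append(pooled_effect(kept_effects, kept_errors))
    return results


effects = [0.2, 0.3, 1.5]      # the third study looks exaggerated
std_errors = [0.1, 0.1, 0.1]

overall = pooled_effect(effects, std_errors)  # about 0.667
omitted = leave_one_out(effects, std_errors)  # about [0.9, 0.85, 0.25]
```

Omitting the third study moves the pooled estimate from about 0.667 to 0.25, while omitting either of the others changes it far less. That pattern is what marks the third study as influential.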
Zero-inflated ordered logit models
Ordered logit regression is used to model ordered categorical responses, such as symptom severity recorded as none, mild, moderate, or severe. Larger values of such ordered outcomes represent higher levels, but the numeric value is irrelevant.
Nonparametric tests for trend
Do responses have an increasing or decreasing trend? Find out using one of four nonparametric tests for trend:
✔Cochran–Armitage test
✔Jonckheere–Terpstra test
✔Linear-by-linear test
✔Cuzick's test with ranks
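To show what one of these tests computes, here is a minimal sketch of the Jonckheere–Terpstra statistic with its large-sample normal approximation. It assumes the groups are supplied in their hypothesized order and applies no tie correction to the variance. The function name and sample data are illustrative, and the sketch is not Stata's implementation.

```python
import math


def jonckheere_terpstra(groups):
    """Return (J, z) for samples listed in hypothesized increasing order.

    J counts, over every pair of groups (earlier, later), the observation
    pairs in which the later group's value is larger; ties count one half.
    z uses the normal approximation with no tie correction.
    """
    j_stat = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean_j = (n * n - sum(k * k for k in sizes)) / 4.0
    var_j = (n * n * (2 * n + 3)
             - sum(k * k * (2 * k + 3) for k in sizes)) / 72.0
    z = (j_stat - mean_j) / math.sqrt(var_j)
    return j_stat, z


# Three dose groups with a perfectly increasing response.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
j_stat, z = jonckheere_terpstra(groups)  # J = 27, z = 3.0
```

A large positive z indicates an increasing trend across the ordered groups, and a large negative z a decreasing one.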
Stata on Apple Silicon
✔Native M1 processor support
✔Universal application for both Intel and Apple Silicon Macs
✔One license, both kinds of hardware
NEW IN STATA 17