Computing values of a latent variable

Is there an easy way to compute the (standardized) values of latent variables so that it generates a new variable in the dataset?

I am working with Stata. The SEM model builder doesn't seem to offer that option.

Thanks for all advice!


Books for statistical computing course?

I’m taking an introductory graduate course in statistical programming. I usually like to read textbooks along with my coursework, but the professor doesn’t have any suggestions for books matching the syllabus. Is there any book that could help me read up on this material?

The basic outline of the class is this: For each topic, the professor demonstrates the underlying math on the blackboard, then writes code to perform the algorithm. He moves very quickly, so I’m hoping there might be a book (or several) I could supplement my notes with. I’m interested in both the math and the algorithms.

Here is the topic list from the syllabus:

The purpose of this course is to teach the art of statistical
programming in R, Python, and C/C++, by writing computer code to
implement the following core algorithms in statistical computing.

· Least squares regression, sweep operator, QR decomposition

· Eigen computation, Principal Component Analysis

· Logistic regression, Newton-Raphson

· Lasso, coordinate descent, boosting, solution path

· Feed-forward neural network, back-propagation

· EM algorithm, Gaussian mixture, factor analysis

· Random number generators, Monte Carlo integration

· Metropolis algorithm, Gibbs sampling, Bayesian posterior

When going through the above topics, the focus will be on algorithms
and especially programming, instead of theories of learning, inference
and computing.
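Not a book recommendation, but to illustrate the flavor of the first syllabus item, a least-squares fit via QR decomposition might look like this (a sketch of my own, not course material):

```python
import numpy as np

# Least squares via QR: solve min ||X b - y||^2 by factoring X = Q R
# and back-solving R b = Q^T y, which avoids forming X^T X explicitly.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.standard_normal((50, 2))])
true_beta = np.array([1.0, 2.0, -3.0])
y = X @ true_beta                  # noiseless data, so recovery is exact

Q, R = np.linalg.qr(X)             # reduced QR: Q is 50x3, R is 3x3
beta = np.linalg.solve(R, Q.T @ y)
```

A book covering this kind of material from the computational side is what you would be looking for; the code above is the level of abstraction such a course typically works at.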


Computing the $n$-th derivative of $f(x)=\frac{\cos(5x^4)-1}{x^7}$ using the Maclaurin series

The prompt is to find the $9$-th derivative of the function $f(x)$ defined as
$$f(x)= \frac{\cos(5x^4)-1}{x^7}$$
at $x = 0$.

We are advised to use the Maclaurin series for $f(x)$.
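A sketch of how the suggested approach plays out: expand $\cos u$ and substitute $u = 5x^4$, then divide by $x^7$ term by term.

```latex
\cos(5x^4) - 1 = -\frac{(5x^4)^2}{2!} + \frac{(5x^4)^4}{4!} - \cdots
               = -\frac{25x^8}{2} + \frac{625x^{16}}{24} - \cdots
\qquad\Longrightarrow\qquad
f(x) = -\frac{25}{2}\,x + \frac{625}{24}\,x^9 - \cdots
```

Since the Maclaurin coefficient of $x^9$ equals $f^{(9)}(0)/9!$, this gives $f^{(9)}(0) = 9!\cdot\frac{625}{24} = 9{,}450{,}000$.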


Collusion-resistant computing of the same function by two servers

Is there a secure way for two servers to compute the same hash function over replicated data? I mean that both servers host the same version of the replicated data and should output h(M) without colluding, i.e., in a way that guarantees both of them really compute the specified function.



Computing the probability of extracting at least n balls from an urn

I cannot figure out how to model the probability computation in the following scenario.

I have an urn with 31 balls, of which 21 are red and 10 are black. If I extract 10 balls from the urn, what is the probability that at least 6 are black?

(Edit: it is a bulk extraction, so neither replacement nor order matters.)

I am more interested in how to use binomial formulas than in the numeric result; thanks!
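Since the balls are drawn in bulk without replacement, the number of black balls follows a hypergeometric distribution, whose terms are built from binomial coefficients. A minimal sketch of the computation (the function name and defaults are mine):

```python
from math import comb

def prob_at_least(k_min, total=31, black=10, drawn=10):
    """P(at least k_min black balls) when drawing `drawn` balls
    without replacement from an urn with `black` black balls."""
    # Hypergeometric: P(X = k) = C(black, k) * C(total-black, drawn-k) / C(total, drawn)
    red = total - black
    return sum(
        comb(black, k) * comb(red, drawn - k)
        for k in range(k_min, min(black, drawn) + 1)
    ) / comb(total, drawn)

p = prob_at_least(6)
print(p)  # ≈ 0.0322
```

Summing the hypergeometric terms from 6 up to 10 is exactly the "at least" event; a binomial model would only be appropriate if the balls were replaced between draws.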


Best easy and straightforward way of computing McDonald’s omega in a unidimensional test with `psych` in R

I’m preparing a masterclass in Psychometrics for a course, and I would like to introduce McDonald’s omega coefficient as the proper index for assessing the reliability of a scale.

From what I can see, the function omega in package psych is the best way to do it. However, as can be seen in the function's help and elsewhere, this function assumes the existence of group factors, implying a certain lack of unidimensionality in the data.

In the example I want to put forth, I am assuming unidimensional scales, which actually have a very low number of items. Given that the scale is unidimensional, I assume that all the common variance should be explained by the general (and only) factor, so there is no group-factor common variance. In this very precise case, I understand that hierarchical omega ($\omega_H$) and total omega ($\omega_T$) should be the same.

The problem when using omega with such data is, to my understanding, that there is no simple way to assume scale unidimensionality.
Given that this is a class in psychometrics, not one about programming in R, I would like an easy, straightforward way to compute the index.

I give an example below of my attempts to solve it, using one of the supplementary material datasets of the book “Predictive HR Analytics” (Edwards & Edwards): file “Chapter 5 RAW Survey Results.sav” at

First attempt: use omega with a “group factor” per item

Using the variables Eng1 to Eng4, I call omega with four group factors:


library(haven)  # read_sav
library(dplyr)  # %>%, select, starts_with
library(psych)  # omega

hr_data <- "dat/Chapter 5 RAW Survey Results.sav" %>% read_sav

hr_data %>% select(starts_with("Eng")) %>%
  omega(nfactors = 4, fm = "gls", cor = "poly")

# Omega
# Call: omega(m = ., nfactors = 4, fm = "gls", cor = "poly")
# Alpha:                 0.9
# G.6:                   0.88
# Omega Hierarchical:    0.9
# Omega H asymptotic:    0.91
# Omega Total            1 
# ... (truncated)

Engagement bifactor model for omega

This solution works because it yields a “group factor” per item, meaning that the general factor accounts for all the common variance, while the group factors actually explain the uniquenesses of the items. Thus, all the variance is considered common variance, and $\omega_T = 1$, as returned by the function.
However, in this case $\omega_H$ should equal $\omega_T$, so the proper value would be $\omega_H = \omega_T = 1$.

Second attempt: same strategy, different scale

Now, I try the same strategy, with the items pos1 – pos3:

hr_data %>% select(starts_with("pos")) %>%
  omega(nfactors = 3, fm = "gls", cor = "poly")

# Omega 
# Call: omega(m = ., nfactors = 3, fm = "gls", cor = "poly")
# Alpha:                 0.77 
# G.6:                   0.73 
# Omega Hierarchical:    0.73 
# Omega H asymptotic:    0.83 
# Omega Total            0.87 
# ... (truncated)
# Warning message:
# In GPFoblq(L, Tmat = Tmat, normalize = normalize, eps = eps, maxit = maxit,  :
#   convergence not obtained in GPFoblq. 1000 iterations used.

POS bifactor model for omega

Probably due to lack of convergence (see the warning at the end of the output), this does not yield a solution with a “group factor” per item. Therefore, the strategy fails, showing it is not a universal way of computing $\omega$ in my use case.

Alternative solution: writing a function

Using the output from fa, I could write a function myself, sort of:

pos_efa <- hr_data %>% select(starts_with("pos")) %>%
  fa(nfactors = 1, fm = "gls", cor = "poly", correct = 0)

pos_efa$loadings^2 %>% colMeans

(Note: I am assuming that, given that I have polychoric correlations, the observed score variances are 1; therefore I average the squared factor loadings. However, this does not seem to give the same result as the $\omega_H$ computed by omega when I use e.g. the Eng1-Eng4 variables, so such a function, which computes the “proportion of variance explained” by the common factor, would probably not even be correct.)
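For reference, under a unidimensional (congeneric) model with standardized indicators, McDonald's omega reduces to a single coefficient computed from the factor loadings $\lambda_i$ (this is the textbook formula, not something taken from the psych documentation):

```latex
\omega = \frac{\left(\sum_i \lambda_i\right)^2}
              {\left(\sum_i \lambda_i\right)^2 + \sum_i \left(1 - \lambda_i^2\right)}
```

Note that this uses the square of the sum of the loadings, not the mean of the squared loadings, which may explain why averaging `pos_efa$loadings^2` does not reproduce the value reported by omega.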

The problem with this approach is that I would like to provide the students with readily available tools (in this case, psych), and avoid cluttering the class with extra R code, functions, and so on.

Therefore, the question is: What is the most straightforward way of computing McDonald’s omega in a unidimensional test in R, if any? Otherwise, what would be the proper way of computing its value in the unidimensional case with polychoric correlations (without much coding)?

Thank you so much in advance.


Computing a double integral using Monte Carlo techniques in Julia

I decided to try to learn Julia for scientific computing, and I decided to tackle the problem of finding

$$ \int_{D_{\frac{1}{4}}} x^4 + y^2 \, dA $$

where $ D_{\frac{1}{4}} $ is the part of the unit circle in the first quadrant.

My code in Julia is the following:

using Distributions

e = 10.0^(-3);    # target absolute error
p = 0.85;         # desired confidence level
variance = 4;     # crude upper bound on the estimator variance

# Chebyshev-style bound on the required number of samples
N = floor(Int, variance / ((1-p)*((e/2)^2))) + 1

u = Uniform(0, 2);
x = rand(N);      # uniform on [0, 1)
y = rand(N);
z = rand(u, N);   # uniform on [0, 2)

# hit-or-miss estimate; the bounding box has volume 1 * 1 * 2 = 2
result = sum((x.^2 .+ y.^2 .<= 1) .& (z .<= x.^4 .+ y.^2)) * 2.0 / N

which gives the nice result $ \approx 0.2945746303294543 $.

I kindly ask for advice on how to improve my implementation and reduce its memory footprint (it uses almost 2-3 GB of RAM).
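On the memory question: the 2-3 GB comes from materializing several length-N vectors at once; drawing the samples in fixed-size chunks keeps peak memory bounded while leaving the estimate unchanged. A sketch of that idea (in Python with NumPy purely for illustration; the same pattern ports directly to Julia):

```python
import numpy as np

def mc_integral(n_samples, chunk=100_000, seed=12345):
    # Hit-or-miss Monte Carlo for the integral of x^4 + y^2 over the
    # quarter of the unit disk, sampling the box [0,1] x [0,1] x [0,2].
    # Draws are processed in fixed-size chunks so peak memory is O(chunk).
    rng = np.random.default_rng(seed)
    hits = 0
    done = 0
    while done < n_samples:
        m = min(chunk, n_samples - done)
        x = rng.random(m)
        y = rng.random(m)
        z = 2.0 * rng.random(m)
        hits += np.count_nonzero((x**2 + y**2 <= 1.0) & (z <= x**4 + y**2))
        done += m
    return 2.0 * hits / n_samples  # bounding box volume is 1 * 1 * 2 = 2

est = mc_integral(2_000_000)
print(est)  # the exact value is 3*pi/32 ≈ 0.29452
```

With chunking, peak memory is proportional to `chunk` rather than to `n_samples`, at no cost in accuracy.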


Computing the Power Spectral Density (PSD) of a CSV file in Python

I'm quite new to signal processing, and I'm trying to calculate the PSD of a signal I'm sampling. The signal is the output of a DC buck converter.

This is the code I'm using, and below are the plots I'm getting:

import pandas as pd
import matplotlib.pyplot as plt

filename = 'scope_yos2.csv'
data = pd.read_csv(filename)
values = data.values
time = values[:, 0]       # first column: time axis
voltage = values[:, 1]    # second column: amplitude
num_of_samples = len(voltage)
sampl_freq = 1.25e9
# the original call was cut off here; NFFT is an assumed value
Pxx, freqs = plt.psd(voltage, NFFT=1024, Fs=sampl_freq)
plt.show()

[Figures: the first image is the DC/DC buck converter output; the second is the PSD plot.]

The CSV file contains two columns: the first is the time axis and the second is the amplitude.

  1. How come the plot is so smooth? This is the plot I'm getting when using the Welch algorithm.

  2. Also, there's an argument called **kwargs. What is it used for? As you can see, I'm not using it.
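On question 1: Welch's method averages the periodograms of many overlapping, windowed segments, and it is precisely this averaging that reduces the variance of the estimate and makes the plot look smooth (at the cost of frequency resolution). A self-contained sketch of the idea (not the matplotlib internals; the segment length and window are my choices):

```python
import numpy as np

def welch_psd(x, fs, nperseg=256):
    """Average the periodograms of 50%-overlapping, Hann-windowed
    segments of x. The averaging is what smooths the estimate."""
    step = nperseg // 2
    win = np.hanning(nperseg)
    scale = fs * np.sum(win ** 2)
    periodograms = [
        np.abs(np.fft.rfft(x[i:i + nperseg] * win)) ** 2 / scale
        for i in range(0, len(x) - nperseg + 1, step)
    ]
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, np.mean(periodograms, axis=0)

# a 50 Hz sine sampled at 1 kHz should show a peak near 50 Hz
fs = 1000.0
t = np.arange(4096) / fs
freqs, psd = welch_psd(np.sin(2 * np.pi * 50 * t), fs)
```

Fewer, longer segments give a spikier, higher-resolution estimate; more, shorter segments give a smoother one. As for `**kwargs` in plt.psd, it just collects optional keyword arguments (line style and the like) that are forwarded to the underlying plot call, so it is fine to leave it unused.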



(Python + OpenCV) Haar cascade training using cloud computing

I will train my own cascade using Python and OpenCV. I want it to be as precise as possible, and as I've read, it is common for training to take one or two weeks. I want to use cloud computing so I don't have to wait that long, but I cannot find a single guide that explains what to do. I have positive and negative images, and a Python script. How can I run my code on the cloud with the least additional effort? (Until now I have only written and run scripts on my desktop; I have no experience with cloud computing.) Searching for opencv, python, and cloud computing together turns up nothing; topics like this seem to be about permanent web applications. I just want to run the code and get the results, basically.
