I finally managed to figure out the settings for matplotlib's surface plot that makes a bivariate distribution look more like those in published articles.
The first version uses
ax2 = fig2.add_subplot(111, projection='3d')
surf = ax2.plot_surface(X, Y, Z, rstride=1, cstride=1,
cmap = cm.gray_r, alpha=0.9, linewidth=1)
The second version uses
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, color='0.8',
For example the following shows a mixture of two bivariate normal distributions and the estimate by gaussian_kde. The colored areas are the differences between kde and the true distribution, in the blue areas the kde is too large, in the redish areas the kde is too small. It's a contour plot version for showing that gaussian_kde with default settings for the bandwidth lowers the hills and fills the valleys in the case of bimodal distributions.