The reason is fairly obvious: the size of a vector file grows with the number of data points, while the size of a raster file grows with the number of pixels.
There are many potential solutions. The simplest is to rasterize only the large set of scatter points by passing rasterized=True to the plotting call, so that the axes, ticks and labels stay as vector graphics. For example:
plt.plot(x, y, 'o', alpha=0.1, rasterized=True)
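The resulting PDF is much smaller, because the points are embedded as a single image whose resolution is set by the dpi passed to savefig, rather than as one vector object per point.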
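To check the effect on your own data, a minimal sketch along these lines saves the same plot twice, once as pure vector and once with the points rasterized, and compares the file sizes. The helper name compare_pdf_sizes, the random Gaussian data, the point count and the dpi value are all assumptions made for illustration; only plot(..., rasterized=True) comes from the answer above.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def compare_pdf_sizes(n_points=200_000, dpi=100):
    """Return (vector_bytes, rasterized_bytes) for the same scatter plot.

    Hypothetical helper for illustration; the data are random Gaussians.
    """
    rng = np.random.default_rng(0)
    x = rng.normal(size=n_points)
    y = rng.normal(size=n_points)

    sizes = []
    with tempfile.TemporaryDirectory() as tmpdir:
        for rasterized in (False, True):
            fig, ax = plt.subplots()
            # Only the point layer is rasterized; axes and text stay vector.
            ax.plot(x, y, 'o', alpha=0.1, rasterized=rasterized)
            path = os.path.join(tmpdir, f"scatter_rasterized_{rasterized}.pdf")
            # dpi sets the resolution of the embedded raster image.
            fig.savefig(path, dpi=dpi)
            plt.close(fig)
            sizes.append(os.path.getsize(path))
    return tuple(sizes)


if __name__ == "__main__":
    vector_bytes, raster_bytes = compare_pdf_sizes()
    print(f"vector PDF:     {vector_bytes / 1024:.0f} KiB")
    print(f"rasterized PDF: {raster_bytes / 1024:.0f} KiB")
```

The exact numbers depend on the number of points, the figure size and the dpi. A higher dpi gives sharper points at the cost of a larger embedded image, so it is worth trying a couple of values before settling on one.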
 
