> If your app needs a relational database then use an in-memory db like hsql or cloud services like Heroku Postgres, RDS, Redis Labs, etc. However one risk with most in-memory databases is that they differ from what is used in production. JPA / Hibernate try to hide this but sometimes bugs crop up due to subtle differences. So it is best to mimic the production services for developers even down to the version of the database.
The idea that you can develop locally or test against a different type of database than what you are (or will be) using in production is silly. I don't just think it "sometimes" is a bad idea, I reject it outright. You should always be using the exact same database (type and version).
For cloud deployments most XaaS providers have cheap or free dev tiers. For local development it's easy to use VMs. For any project that requires external resources (Postgres, Redis, RabbitMQ, etc) I have a VM that gets spun up via Vagrant[1]. Just "vagrant up" and all external resources required by the app should be available.
> So first… Use a build tool. It doesn’t matter if you choose Ant + Ivy, Maven, Gradle, or sbt. Just pick one and use it to automatically pull your dependencies from Maven Central or your own Artifactory / Nexus server. With WebJars you can even manage your JavaScript and CSS library dependencies. Then get fancy by automatically denying SCM check-ins that include Jar files.
Though I agree that it's a good idea to have dependencies externalized, it's not always possible. Some libraries (ex: third-party JARs) are not available in public Maven repos and not everyone has a private one set up. In those situations it's fine to check the JAR into the project itself.
I like the approach outlined by Heroku for dealing with unmanaged dependencies[2]. You declare them normally in your pom.xml but also include the path to a local directory (checked into SCM) with the JAR files. Anybody that clones your repo should be able to build the project with no manual steps. Works great!
LATERAL is awesome. It makes a lot of queries that required sub-select joins much simpler to write and later read.
It's also great for set returning functions. Even cooler, you don't need to explicitly specify the LATERAL keyword. The query planner will add it for you automatically:
-- NOTE: WITH clause is just to fake a table with data:
WITH foo AS (
  SELECT 'a' AS name
       , 2   AS quantity
  UNION ALL
  SELECT 'b' AS name
       , 4   AS quantity)
SELECT t.*
     , x
FROM foo t
   -- No need to say "LATERAL" here as it's added automatically
   , generate_series(1, quantity) x;

 name | quantity | x
------+----------+---
 a    |        2 | 1
 a    |        2 | 2
 b    |        4 | 1
 b    |        4 | 2
 b    |        4 | 3
 b    |        4 | 4
(6 rows)
Usually when I've had to use cross/outer apply it's been to work around overly normalized, and somewhat bad data.
Agreed on range types... proper enums in T-SQL would be nice too. I'm really liking where PL/V8 is going, and would like to see something similar in MS SQL Server as well; the .NET extensions are just too much of a pain to do much with. It'd be nice to have a syntax that makes working with custom data types, or even just JSON and XML, easier.
If PostgreSQL adds built-in replication to the Open-Source version that isn't a hacky add-on, and has failover similar to, for example MongoDB's replica sets, I'm so pushing pg for most new projects.
Maria/MySQL seem to be getting interesting as well. Honestly, I like MS-SQL until the cost of running it gets a little wonky (Azure pricing going from a single instance to anything that can have replication for example). Some of Amazon's offerings are really getting compelling here.
No, you can write queries that are not really possible to express without it. Basically, it allows you to execute a table-valued function for each row in an earlier query.
For example, in SQL Server I find a common use of CROSS APPLY (which appears to be the same thing) is where the "table-valued function" is a SELECT with a WHERE clause referencing the earlier query, an ORDER BY, and a TOP (=LIMIT) 1. (In fact, this is exactly the example given in the article.) It allows you to do things like "for each row in table A, join the last row in table B where NaturalKey(A) = NaturalKey(B) and Value1(A) is greater than or equal to Value2(B)".
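Here's roughly what that pattern looks like with Postgres's LATERAL (tables a and b, their columns, and the created_at column used to define "last" are all made up for illustration):

SELECT a.*, latest.*
FROM a
CROSS JOIN LATERAL (
    -- the "last" matching row of b for this row of a
    SELECT b.*
    FROM b
    WHERE b.natural_key = a.natural_key
      AND a.value1 >= b.value2
    ORDER BY b.created_at DESC
    LIMIT 1
) latest;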
That's not true. Anything you can do with LATERAL you can also do with correlated scalar subqueries in the SELECT list. LATERAL simply makes writing these kinds of queries easier and more intuitive.
The syntax for this is pretty horrible, however. And if you want to return more than one column from the subquery, you would have to duplicate the subquery definition for each column, right? Then you'd have to have faith that the optimizer can work out what you meant and reconstruct just a single subquery.
There's no faith required; the planner is guaranteed not to do that. The "normal" way is to create a composite type containing each of the columns you need, and then "unpack" it to separate columns. Horrible? Yeah, but it's possible.
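A rough sketch of that composite-type trick (the tables, columns, and the b_cols type here are all made up):

-- A composite type to carry the columns we want out of the subquery.
CREATE TYPE b_cols AS (col1 int, col2 text);

SELECT ss.id
     , (ss.pick).col1   -- "unpack" the composite back into separate columns
     , (ss.pick).col2
FROM (SELECT a.id
           , (SELECT ROW(b.col1, b.col2)::b_cols
              FROM b
              WHERE b.a_id = a.id
              ORDER BY b.col1 DESC
              LIMIT 1) AS pick
      FROM a) ss;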
Is it possible with scalar subqueries to perform anything other than a tree of correlation? With CROSS APPLY one can correlate a DAG of subqueries, e.g. a diamond where B and C depend on A, and D depends on B and C.
What if the limits on the lateral subqueries were 2 instead of 1, and they were doing select * instead of select sum() in the outer query? How would you recreate that with correlated SCALAR subqueries? There's no such thing as non-scalar correlated subqueries, is there?
SELECT (unnest(ar)).*
FROM (SELECT ARRAY(SELECT tbl FROM tbl
                   WHERE .. ORDER BY .. LIMIT 2) AS ar
      FROM .. OFFSET 0) ss;
If you want a specific set of columns instead of *, you'd need to create a custom composite type to create an array of, since it's not possible to "unpack" anonymous records.
Looking at the examples from the official documentation, I agree with your sentiment. Indeed, the conciseness can cause some confusion to people familiar with the existing scoping rules.
Most Postgres clients have issues connecting to Amazon Redshift. The wire protocol is the same, so basic interactions (ex: connect via psql and run a SELECT) usually work, but things get hairy when you start doing more complicated things or even just querying the data dictionary (the basic information_schema is there, but Redshift has its own tables for its specific features).
My company[1] makes a database client that runs in your web browser that has explicit support for Redshift (as well as Postgres, MySQL, and more). I encourage you to check it out.
> In this trial, XFINITY Internet Economy Plus customers can choose to enroll in the Flexible-Data Option to receive a $5.00 credit on their monthly bill and reduce their data usage plan from 300 GB to 5 GB. If customers choose this option and use more than 5 GB of data in any given month, they will not receive the $5.00 credit and will be charged an additional $1.00 for each gigabyte of data used over the 5 GB included in the Flexible-Data Option.
What informed consumer would agree to this?!
Streaming Netflix uses about 1GB/hour for "good" quality streams and 2+GB/hour for HD streams. That would make watching an hour a night come out to $30-60/month of bandwidth charges. Multiply accordingly if you watch more or have multiple people streaming simultaneously. Oh and the way it's worded it sounds like you lose the $5 discount if you go over 5GB.
I can see this leading to some serious bill shock for anybody that signs up.
This is targeted at low-income consumers who can't afford anything but the lowest-rate plan. Since they will also be the most likely to value an extra $5/month, this is a way for Comcast to milk them for extra money each month when they inevitably go over the 5GB limit. It's a scam.
I don't see how you can say that. We know that around 95% of their overall customers use more than 5 GB, but that's not really relevant since this deal is not being offered to the overall customer base. It is only being offered to people on the Economy Plus plan, which is the very bottom tier plan (3 Mbps).
Generally, heavy data users go for faster plans, and so the very low data users should be disproportionately concentrated in the Economy Plus plan.
I don't understand why most people in the discussion here and on Reddit are overlooking that this is only for Economy Plus customers.
> Informed consumers that know they don't use that much data?
This is part of the problem. It's very difficult for normal customers to understand how much data they're going to use.
When choosing a phone data plan, my dad asked me what plan to go for. How much does google maps use? No idea..
Also, extremely difficult for the customer to dispute the particular data usage, because they don't have access to the audit data that comcast has, and they don't have the technical skill to determine that they really accessed "turnerhd-f.akamaihd.net" when visiting cnn.com anyway. Contrast this with the relative ease of checking phone numbers you don't recognize on your itemized phone bill.
Lastly, most objectionable is this idea of charging penalties for overages. In most industries, if you get more, you pay incrementally less. If I buy a hamburger, soda and fries, I get a discount on the price. If I buy 1000 widgets I get a better price than 10 of them. Comcast and the telcos turn this on its head by charging penalties for overages.
Regulators could force carriers to pro-rate based on the price you paid for your service. For example:
ISP offers four packages:
100 GB for $10/mo (overages charged at $0.10/gb)
300 GB for $20/mo (overages charged at $0.06/gb)
750 GB for $30/mo (overages charged at $0.04/gb)
2000 GB for $40/mo (overages charged at $0.02/gb)
Even better, offer automatic price brackets. Why make people worry about choosing the right plan? People don't want to make this decision.
First 10GB: $0.10/gb
Next 20GB: $0.07/gb
Next 50GB: $0.05/gb
each GB after that: $0.02/gb
(for illustrative purposes only.. I haven't done the math to see how it compares with current comcast offers)
I don't understand why the telcos think massively penalizing people for going over is a useful pricing model.
Have you ever read your electricity bill? They do just that. Your first 1000 kWh cost $0.07 apiece, your second 1000 kWh cost $0.12, your third $0.14 (for example).
Quite a few of the providers in Texas are strangely the exact opposite: they bill a surcharge for using less than some fixed amount of kWh, i.e. I get charged $5 if I use less than ~500 kWh in a month.
That's demand pricing, which is the other pricing option.
The two models I am familiar with:
- Normal tiered pricing. Low usage is cheap per kWh; high usage is expensive per kWh. This is because of inflexible production capacity.
- Demand rates. You pay a large fixed price, get a lower kWh price, and (this is key) contractually agree to never exceed X amperes. The higher X is, the higher your fixed price. This is a special plan structured for people who need a lot of power, but only at a modest rate of consumption- think baseboard heaters. Again, this model is constrained by inflexible production capacity.
Not the parent poster, but it works the same way in Thailand. Here in Bangkok first 150 kWh are cheapest, 150-400 are more expensive and 400+ is the most expensive tier.
Is this actually how electricity bills work? Mine has always been included in my rent, but I don't see anything like that on the sample bill for my local utility. Nor does that pricing scheme make any sense to me. Why would the utility charge you more for your second 1000kWh than for your first?
> (grandparent) I don't understand why the telcos think massively penalizing people for going over is a useful pricing model.
It makes no sense to me, either. It's like they don't actually want to sell more bandwidth.
Maybe demanding big penalties from people who accidentally go over is more profitable than actually selling to people who actually want more bandwidth.
British Columbia, where hydroelectricity is cheap and plentiful, has a two-tier structure like that. The cut-off isn't exactly 1000 kWh, of course.
The logic is that the utility company plans for generating a "usual" baseline amount, and peaks and use beyond anticipated "norm" are more expensive since you have to spin up/buy additional capacity. In B.C. at least the tiers are set up so that the first category will cover most "average" use.
Also in many places electricity prices are government-regulated to a greater or smaller extent, and a government might structure pricing like that to try to reduce overall energy use or subsidize the lowest users who might be presumed to be poorest.
But still, the only benefit is you save $5/month. And if you accidentally go over, perhaps when someone links you a video, then you'll be losing a lot more than you might save.
I don't really see any advantages. It would make more sense if they said they'd cut your bill by 50%, or something.
An older acquaintance of mine pays probably $30-$40/mo to check Yahoo email 4 or 5 times a week. She also watches a couple of YouTube clips a month; I doubt she uses even a gig of data.
Economy Plus is the cheapest, slowest package. It is only 3 Mbps. Comcast describes it as "Ideal for low usage on 1 device" and the list of example uses is "Surf the Web, email, social networking".
I expect that there are plenty of people who actually fit that profile, use under 5 GB a month, and will find the Flexible-Data Option to be a good deal.
5 GB of traffic takes you all of <4 hours on a 3 Mbps line. So you clicked on that YouTube video your Facebook friend posted but never closed the tab and it's still buffering, or Google releases that new Android version and your smartphone preloads 400 MiB for the update, or you left that Spotify playlist running... there are plenty of non-obvious ways to run through that bandwidth in record time, often through no fault of your own.
I wouldn't even mind billing by traffic, but then they have to play by the rules that other utilities do. There needs to be a clearly established standard for what is counted, when it is counted, where it is counted, and they better make sure to provide an itemized bill for the usage accrued. Otherwise, it's fraud.
Oracle has supported both full refresh and incremental refresh for at least 10 years (maybe longer).
Fast refresh requires creating "MATERIALIZED VIEW LOGS" on the source table(s) and covers most (but not all) aggregations/groupings in the mview SQL. Once it's set up, DML to the source tables gets logged to the mview logs, and a "fast" refresh of the mview uses those logs to incrementally update the mview's data.
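For anyone who hasn't seen it, here's a rough sketch of that setup (table, column, and mview names are made up, and real fast refresh has a few more restrictions than shown):

-- Materialized view log on the source table; DML against "orders"
-- gets recorded here.
CREATE MATERIALIZED VIEW LOG ON orders
  WITH SEQUENCE, ROWID (customer_id, amount)
  INCLUDING NEW VALUES;

-- Aggregate mview that can be fast (incrementally) refreshed.
-- COUNT(*) and COUNT(amount) are included because Oracle requires
-- them for fast refresh of SUM aggregates.
CREATE MATERIALIZED VIEW order_totals_mv
  REFRESH FAST ON DEMAND
  AS
  SELECT customer_id
       , COUNT(*)      AS row_count
       , COUNT(amount) AS amount_count
       , SUM(amount)   AS total_amount
  FROM orders
  GROUP BY customer_id;

-- Later, apply only the logged changes ('F' = fast refresh):
-- EXEC DBMS_MVIEW.REFRESH('ORDER_TOTALS_MV', 'F');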
>> Seemed like a good idea until it dawned on me that this means the passwords are stored as plaintext.
>There are several ways this can be done without that.
>Easiest is if they store the date of the last password change or otherwise know you haven't changed it. If it's old enough, double the plaintext before handing it to the hashing function.
Not quite. What you'd need to do is halve the user's entered password if it's older than the cutoff date. If the user enters "foobarfoobar" then you'd halve it to "foobar" before hashing and comparing it to what's stored.
If you only have the old hashed password stored then you don't have the hash of the doubled password, nor can you infer it.
This whole approach is silly of course. They should just force everybody to reset their passwords.
The reason to inconvenience rather than force is that users in a rush will pick the worst passwords, even paid employees whose password is the only thing between the outside world and highly confidential stuff.
> The old prototype machine had our AWS API access key and secret key. Once the hacker gained access to the keys, he created an IAM user, and generated a key-pair.
Ouch! This is why you must practice the principle of least privilege when provisioning access keys, especially ones that can control your entire infrastructure.
> He was then able to run an instance inside our AWS account using these credentials, and mount one of our backup disks.
Good security practices are like an onion... they've got many layers and will most likely make you cry.
This is why you should be encrypting your backups. The backup program doesn't even need to be able to read the backups it creates; it only needs to be able to write them (i.e. it only needs the public half of the key).
If you're thinking, "Why does that matter? It can already read the plain-text data it's backing up!" then you're not thinking about history or multiple servers. Each may be able to read the current state of its own file system, but if they only have the public half, they can't read the historical state or each other's backups. The damage would be limited to a single server and only what's live on it at that moment.
> We were able to verify the actions of the hacker using AWS CloudTrail, which confirmed that no other services were compromised, no other machines were booted, and our AMIs and other data stores were not copied.
CloudTrail is awesome and if you're on AWS you really should enable it. It costs peanuts compared to what it provides. In cases like this, where the attacker takes over your entire infrastructure, it may be the only log you have of what happened.
This is pretty neat. I'm a big fan of SQL in general and being able to query system stats like this feels pretty natural to me.
A long time back I created something similar to this atop Oracle[1]. It used a Java function calling out to system functions to get similar data sets (I/O usage, memory usage, etc). It was definitely a hack, but a really pleasant one to use.
It'd be cool to see a foreign data wrapper for PostgreSQL[2] that exposes similar functionality. I'm guessing it'd be pretty easy to put together, as you'd only need to expose the data sets themselves as set returning functions; PostgreSQL would handle the rest. Though I guess that would limit its usefulness to servers that have PG already installed. Having it be separate like this lets you drop it on any server (looks like it's cross platform too!).
[1]: I don't remember exactly when but I think 10g had just been released.
The linked site doesn't have much content besides a link to the GitHub repo.
EDIT: Even though it's not explicitly listed in the README or contributor list, the commit history does show a bit of detail: https://github.com/iojs/io.js/commits/v0.12